Class Ensembl::Core::Slice
In: lib/ensembl/core/project.rb
lib/ensembl/core/slice.rb
Parent: Object

DESCRIPTION

From the perl API tutorial (www.ensembl.org/info/software/core/core_tutorial.html): "A Slice object represents a continuous region of a genome. Slices can be used to obtain sequence, features or other information from a particular region of interest."

In contrast to almost all other classes of Ensembl::Core, the Slice class is not based on ActiveRecord.

USAGE

 chr4 = SeqRegion.find_by_name('4')
 my_slice = Slice.new(chr4, 95000, 98000, -1)
 puts my_slice.display_name    #--> 'chromosome:4:Btau_3.1:95000:98000:1'

Methods

Attributes

seq  [RW] 
seq_region  [RW] 
start  [RW] 
stop  [RW] 
strand  [RW] 

Public Class methods

DESCRIPTION

Create an array of all Slices for a given coordinate system.

USAGE

 slices = Slice.fetch_all('chromosome')

Arguments:

  • coord_system_name:: name of coordinate system (default = chromosome)
  • coord_system_version:: version of coordinate system (default = nil)
Returns:an array of Ensembl::Core::Slice objects

[Source]

     # File lib/ensembl/core/slice.rb, line 146
146:       def self.fetch_all(coord_system_name = 'chromosome', version = nil)
147:         answer = Array.new
148:         if version.nil?
149:           coord_system = Ensembl::Core::CoordSystem.find_by_name(coord_system_name)
150:         else
151:           coord_system = Ensembl::Core::CoordSystem.find_by_name_and_version(coord_system_name, version)
152:         end
153: 
154:         coord_system.seq_regions.each do |seq_region|
155:           answer.push(Ensembl::Core::Slice.new(seq_region))
156:         end
157: 
158:         return answer
159:       end

DESCRIPTION

Create a Slice based on a Gene

USAGE

 my_slice = Slice.fetch_by_gene_stable_id('ENSG00000184895')

Arguments:

  • gene_stable_id: Ensembl gene stable_id (required)
Returns:Ensembl::Core::Slice object

[Source]

     # File lib/ensembl/core/slice.rb, line 109
109:       def self.fetch_by_gene_stable_id(gene_stable_id, flanking_seq_length = 0)
110:         gene_stable_id = Ensembl::Core::GeneStableId.find_by_stable_id(gene_stable_id)
111:         gene = gene_stable_id.gene
112:         seq_region = gene.seq_region
113: 
114:         return Ensembl::Core::Slice.new(seq_region, gene.seq_region_start - flanking_seq_length, gene.seq_region_end + flanking_seq_length, gene.seq_region_strand)
115:       end

DESCRIPTION

Create a Slice without first creating the SeqRegion object.

USAGE

 my_slice_1 = Slice.fetch_by_region('chromosome','4',95000,98000,1)

Arguments:

Returns:Ensembl::Core::Slice object

[Source]

    # File lib/ensembl/core/slice.rb, line 74
74:       def self.fetch_by_region(coord_system_name, seq_region_name, start = nil, stop = nil, strand = 1, version = nil)
75:         all_coord_systems = Ensembl::Core::CoordSystem.find_all_by_name(coord_system_name)
76:         coord_system = nil
77:         if version.nil? # Take the version with the lowest rank

78:           coord_system = all_coord_systems.sort_by{|cs| cs.version}.reverse.shift
79:         else
80:           coord_system = all_coord_systems.select{|cs| cs.version == version}[0]
81:         end
82:         unless coord_system.class == Ensembl::Core::CoordSystem
83:           message = "Couldn't find a Ensembl::Core::CoordSystem object with name '" + coord_system_name + "'"
84:           if ! version.nil?
85:             message += " and version '" + version + "'"
86:           end
87:           raise message
88:         end
89: 
90:         seq_region = Ensembl::Core::SeqRegion.find_by_name_and_coord_system_id(seq_region_name, coord_system.id)
91:         #seq_region = Ensembl::Core::SeqRegion.find_by_sql("SELECT * FROM seq_region WHERE name = '" + seq_region_name + "' AND coord_system_id = " + coord_system.id.to_s)[0]

92:         unless seq_region.class == Ensembl::Core::SeqRegion
93:           raise "Couldn't find a Ensembl::Core::SeqRegion object with the name '" + seq_region_name + "'"
94:         end
95: 
96:         return Ensembl::Core::Slice.new(seq_region, start, stop, strand)
97:       end

DESCRIPTION

Create a Slice based on a Transcript

USAGE

 my_slice = Slice.fetch_by_transcript_stable_id('ENST00000383673')

Arguments:

  • transcript_stable_id: Ensembl transcript stable_id (required)
Returns:Ensembl::Core::Slice object

[Source]

     # File lib/ensembl/core/slice.rb, line 127
127:       def self.fetch_by_transcript_stable_id(transcript_stable_id, flanking_seq_length = 0)
128:         transcript_stable_id = Ensembl::Core::TranscriptStableId.find_by_stable_id(transcript_stable_id)
129:         transcript = transcript_stable_id.transcript
130:         seq_region = transcript.seq_region
131: 
132:         return Ensembl::Core::Slice.new(seq_region, transcript.seq_region_start - flanking_seq_length, transcript.seq_region_end + flanking_seq_length, transcript.seq_region_strand)
133:       end

DESCRIPTION

Create a new Slice object from scratch.

USAGE

 chr4 = SeqRegion.find_by_name('4')
 my_slice = Slice.new(chr4, 95000, 98000, -1)

Arguments:

Returns:Slice object

[Source]

    # File lib/ensembl/core/slice.rb, line 46
46:       def initialize(seq_region, start = 1, stop = seq_region.length, strand = 1)
47:         if start.nil?
48:           start = 1
49:         end
50:         if stop.nil?
51:           stop = seq_region.length
52:         end
53:         unless seq_region.class == Ensembl::Core::SeqRegion
54:           raise 'First argument has to be a Ensembl::Core::SeqRegion object'
55:         end
56:         @seq_region, @start, @stop, @strand = seq_region, start, stop, strand
57:         @seq = nil
58:       end

Public Instance methods

DESCRIPTION

The display_name method returns a full name of this slice, containing the name of the coordinate system, the sequence region, start and stop positions on that sequence region and the strand. E.g. for a slice of bovine chromosome 4 from position 95000 to 98000 on the reverse strand, the display_name would look like: chromosome:4:Btau_3.1:95000:98000:-1

USAGE

 puts my_slice.display_name

Arguments:none
Result:String

[Source]

     # File lib/ensembl/core/slice.rb, line 191
191:       def display_name
192:         return [self.seq_region.coord_system.name, self.seq_region.coord_system.version, self.seq_region.name, self.start.to_s, self.stop.to_s, self.strand.to_s].join(':')
193:       end

DESCRIPTION

Get all DnaAlignFeatures that are located on a Slice for a given Analysis.

Pitfall: just looks at the CoordSystem that the Slice is located on. For example, if a Slice is located on a SeqRegion on the ‘chromosome’ CoordSystem, but all dna_align_features are annotated on SeqRegions of the ‘scaffold’ CoordSystem, this method will return an empty array.

USAGE

 my_slice.dna_align_features('Vertrna').each do |feature|
   puts feature.to_yaml
 end

Arguments:

  • code: name of analysis
Returns:array of DnaAlignFeature objects

[Source]

     # File lib/ensembl/core/slice.rb, line 567
567:       def dna_align_features(analysis_name = nil)
568:         if analysis_name.nil?
569:           return DnaAlignFeature.find_by_sql('SELECT * FROM dna_align_feature WHERE seq_region_id = ' + self.seq_region.id.to_s + ' AND seq_region_start >= ' + self.start.to_s + ' AND seq_region_end <= ' + self.stop.to_s)
570:         else
571:           analysis = Analysis.find_by_logic_name(analysis_name)
572:           return DnaAlignFeature.find_by_sql('SELECT * FROM dna_align_feature WHERE seq_region_id = ' + self.seq_region.id.to_s + ' AND seq_region_start >= ' + self.start.to_s + ' AND seq_region_end <= ' + self.stop.to_s + ' AND analysis_id = ' + analysis.id.to_s)
573:         end
574:       end

DESCRIPTION

The Slice#excise method removes a bit of the slice in the middle and returns the three result slices.

CAUTION: this now only works if the ranges do not overlap

USAGE

 original_slice = Slice.fetch_by_region('chromosome','X',1,10000)
 new_slices = original_slice.excise([500..750, 1050..1075])
 new_slices.each do |s|
   puts s.display_name
 end

 # result:
 #   chromosome:X:1:499:1
 #   chromosome:X:751:1049:1
 #   chromosome:X:1076:10000:1

Arguments:

  • ranges: array of ranges (required)
Returns:array of Slice objects

[Source]

     # File lib/ensembl/core/slice.rb, line 279
279:       def excise(ranges)
280:         if ranges.class != Array
281:           raise RuntimeError, "Argument should be an array of ranges"
282:         end
283:         ranges.each do |r|
284:           if r.class != Range
285:             raise RuntimeError, "Argument should be an array of ranges"
286:           end
287:         end
288: 
289:         answer = Array.new
290:         previous_excised_stop = self.start - 1
291:         ranges.sort_by{|r| r.first}.each do |r|
292:           subslice_start = previous_excised_stop + 1
293:           if subslice_start <= r.first - 1
294:             answer.push(Slice.new(self.seq_region, subslice_start, r.first - 1))
295:           end
296:           previous_excised_stop = r.last
297:           if r.last > self.stop
298:             return answer
299:           end
300:         end
301:         subslice_start = previous_excised_stop + 1
302:         answer.push(Slice.new(self.seq_region, subslice_start, self.stop))
303:         return answer
304:       end

Don‘t use this method yourself.

[Source]

     # File lib/ensembl/core/slice.rb, line 442
442:       def get_objects(target_class, table_name, inclusive = false)
443:         answer = Array.new
444: 
445:         
446:         # Get all the coord_systems with this type of features on them

447:         coord_system_ids_with_features = MetaCoord.find_all_by_table_name(table_name).collect{|mc| mc.coord_system_id}
448: 
449:         # Get the features of the original slice

450:         if coord_system_ids_with_features.include?(self.seq_region.coord_system_id)
451:           sql = ''
452:           if inclusive
453:             sql = "SELECT * FROM \#{table_name}\nWHERE seq_region_id = \#{self.seq_region.id.to_s}\nAND (( seq_region_start BETWEEN \#{self.start.to_s} AND \#{self.stop.to_s} )\nOR   ( seq_region_end BETWEEN \#{self.start.to_s} AND \#{self.stop.to_s} )\nOR   ( seq_region_start <= \#{self.start.to_s} AND seq_region_end >= \#{self.stop.to_s} )\n    )\n"
454:           else
455:             sql = "SELECT * FROM \#{table_name}\nWHERE seq_region_id = \#{self.seq_region.id.to_s}\nAND seq_region_start >= \#{self.start.to_s}\nAND seq_region_end <= \#{self.stop.to_s}   \n"
456:           end
457:           answer.push(target_class.find_by_sql(sql))
458:           coord_system_ids_with_features.delete(self.seq_region.coord_system_id)
459:         end
460: 
461:         # Transform the original slice to other coord systems and get those

462:         # features as well. At the moment, only 'direct' projections can be made.

463:         # Later, I'm hoping to add functionality for following a path from one

464:         # coord_system to another if they're not directly linked in the assembly

465:         # table.

466:         coord_system_ids_with_features.each do |target_coord_system_id|
467:           target_slices = self.project(CoordSystem.find(target_coord_system_id).name)
468:           target_slices.each do |slice|
469:             if slice.class == Slice
470:               if inclusive
471:                 sql = "SELECT * FROM \#{table_name}\nWHERE seq_region_id = \#{slice.seq_region.id.to_s}\nAND (( seq_region_start BETWEEN \#{slice.start.to_s} AND \#{slice.stop.to_s} )\nOR   ( seq_region_end BETWEEN \#{slice.start.to_s} AND \#{slice.stop.to_s} )\nOR   ( seq_region_start <= \#{slice.start.to_s} AND seq_region_end >= \#{slice.stop.to_s} )\n    )\n"
472:               else
473:                 sql = "SELECT * FROM \#{table_name}\nWHERE seq_region_id = \#{slice.seq_region.id.to_s}\nAND seq_region_start >= \#{slice.start.to_s}\nAND seq_region_end <= \#{slice.stop.to_s}   \n"
474:               end 
475:                 answer.push(target_class.find_by_sql(sql))
476:             end
477:           end
478:         end
479: 
480:         answer.flatten!
481:         answer.uniq!
482: 
483:         return answer
484:       end

DESCRIPTION

Get the length of a slice

USAGE

 chr4 = SeqRegion.find_by_name('4')
 my_slice = Slice.new(chr4, 95000, 98000, -1)
 puts my_slice.length

Arguments:none
Returns:Integer

[Source]

     # File lib/ensembl/core/slice.rb, line 175
175:       def length
176:         return self.stop - self.start + 1
177:       end

Don‘t use this method yourself.

[Source]

     # File lib/ensembl/core/slice.rb, line 416
416:       def method_missing(method_name, *args)
417:         table_name = method_name.to_s.singularize
418:         class_name = table_name.camelcase
419: 
420:         # Convert to the class object

421:         target_class = nil
422:         ObjectSpace.each_object(Class) do |o|
423:           if o.name =~ /^Ensembl::Core::#{class_name}$/
424:             target_class = o
425:           end
426:         end
427: 
428:         # If it exists, see if it implements Sliceable

429:         if ! target_class.nil? and target_class.include?(Sliceable)
430:           inclusive = false
431:           if [TrueClass, FalseClass].include?(args[0].class)
432:             inclusive = args[0]
433:           end
434:           return self.get_objects(target_class, table_name, inclusive)
435:         end
436: 
437:         raise NoMethodError
438: 
439:       end

DESCRIPTION

Get all MiscFeatures that are located on a Slice for a given MiscSet.

Pitfall: just looks at the CoordSystem that the Slice is located on. For example, if a Slice is located on a SeqRegion on the ‘chromosome’ CoordSystem, but all misc_features are annotated on SeqRegions of the ‘scaffold’ CoordSystem, this method will return an empty array.

USAGE

 my_slice.misc_features('encode').each do |feature|
   puts feature.to_yaml
 end

Arguments:

Returns:array of MiscFeature objects

[Source]

     # File lib/ensembl/core/slice.rb, line 531
531:       def misc_features(code)
532:         answer = Array.new
533:         if code.nil?
534:           self.seq_region.misc_features.each do |mf|
535:             if mf.seq_region_start > self.start and mf.seq_region_end < self.stop
536:               answer.push(mf)
537:             end
538:           end
539:         else
540:           self.seq_region.misc_features.each do |mf|
541:             if mf.misc_sets[0].code == code
542:               if mf.seq_region_start > self.start and mf.seq_region_end < self.stop
543:                 answer.push(mf)
544:               end
545:             end
546:           end
547:         end
548:         return answer
549:       end

DESCRIPTION

The Slice#overlaps? method checks if this slice overlaps another one. The other slice has to be on the same coordinate system

USAGE

 slice_a = Slice.fetch_by_region('chromosome','X',1,1000)
 slice_b = Slice.fetch_by_region('chromosome','X',900,1500)
 if slice_a.overlaps?(slice_b)
   puts "There slices overlap"
 end

Arguments:another slice
Returns:true or false

[Source]

     # File lib/ensembl/core/slice.rb, line 209
209:       def overlaps?(other_slice)
210:         if ! other_slice.class == Slice
211:           raise RuntimeError, "The Slice#overlaps? method takes a Slice object as its arguments."
212:         end
213:         if self.seq_region.coord_system != other_slice.seq_region.coord_system
214:           raise RuntimeError, "The argument slice of Slice#overlaps? has to be in the same coordinate system, but were " + self.seq_region.coord_system.name + " and " + other_slice.seq_region.coord_system.name
215:         end
216: 
217:         self_range = self.start .. self.stop
218:         other_range = other_slice.start .. other_slice.stop
219: 
220:         if self_range.include?(other_slice.start) or other_range.include?(self.start)
221:           return true
222:         else
223:           return false
224:         end
225:       end

DESCRIPTION

The Slice#project method is used to transfer coordinates from one coordinate system to another. Suppose you have a slice on a contig in human (let‘s say on contig AC000031.6.1.38703) and you want to know the coordinates on the chromosome. This is a projection of coordinates from a higher ranked coordinate system to a lower ranked coordinate system. Projections can also be done from a chromosome to the contig level. However, it might be possible that more than one contig has to be included and that there exist gaps between the contigs. The output of this method therefore is an array of Slice and Gap objects.

At the moment, projections can only be done if the two coordinate systems are linked directly in the ‘assembly’ table.

USAGE

 # Get a contig slice in cow and project to scaffold level
 # (i.e. going from a high rank coord system to a lower rank coord
 # system)
 source_slice = Slice.fetch_by_region('contig', 'AAFC03020247', 42, 2007)
 target_slices = source_slice.project('scaffold')
 puts target_slices.length          #--> 1
 puts target_slices[0].display_name #--> scaffold:ChrUn.003.3522:6570:8535:1

 # Get a chromosome slice in cow and project to scaffold level
 # (i.e. going from a low rank coord system to a higher rank coord
 # system)
 # The region 96652152..98000000 on BTA4 is covered by 2 scaffolds
 # that are separated by a gap.
 source_slice = Slice.fetch_by_region('chromosome','4', 96652152, 98000000)
 target_slices = source_slice.project('scaffold')
 puts target_slices.length          #--> 3
 first_bit, second_bit, third_bit = target_slices
 puts first_bit.display_name        #--> scaffold:Btau_3.1:Chr4.003.105:42:599579:1
 puts second_bit.class              #--> Gap
 puts third_bit.display_name        #--> scaffold:Btau_3.1:Chr4.003.106:1:738311:1

Arguments:

  • coord_system_name:: name of coordinate system to project coordinates to
Returns:an array consisting of Slices and, if necessary, Gaps

[Source]

     # File lib/ensembl/core/project.rb, line 53
 53:       def project(coord_system_name)
 54:         answer = Array.new # an array of slices

 55:         source_coord_system = self.seq_region.coord_system
 56:         target_coord_system = nil
 57:         if coord_system_name == 'toplevel'
 58:           target_coord_system = CoordSystem.find_toplevel
 59:           coord_system_name = target_coord_system.name
 60:         elsif coord_system_name == 'seqlevel'
 61:           target_coord_system = CoordSystem.find_seqlevel
 62:           coord_system_name = target_coord_system.name
 63:         else
 64:           target_coord_system = CoordSystem.find_by_name(coord_system_name)
 65:         end
 66: 
 67:         if target_coord_system.rank < source_coord_system.rank
 68:           # We're going from component to assembly, which is easy.

 69:           assembly_links = self.seq_region.assembly_links_as_component(coord_system_name)
 70:           
 71:           if assembly_links.length == 0
 72:             return []
 73:           else
 74:             assembly_links.each do |assembly_link|
 75:               target_seq_region = assembly_link.asm_seq_region
 76:               target_start = self.start + assembly_link.asm_start - assembly_link.cmp_start
 77:               target_stop = self.stop + assembly_link.asm_start - assembly_link.cmp_start
 78:               target_strand = self.strand * assembly_link.ori # 1x1=>1, 1x-1=>-1, -1x-1=>1

 79:               
 80:               answer.push(Slice.new(target_seq_region, target_start, target_stop, target_strand))
 81:             end
 82:           end
 83:           
 84:         else
 85:           # If we're going from assembly to component, the answer of the target method

 86:           # is an array consisting of Slices intermitted with Gaps.

 87: 
 88:           # ASSEMBLY_EXCEPTIONS

 89:           # CAUTION: there are exceptions to the assembly (stored in the assembly_exception)

 90:           # table which make things a little bit more difficult... For example,

 91:           # in human, the assembly data for the pseudo-autosomal region (PAR) of

 92:           # Y is *not* stored in the assembly table. Instead, there is a record

 93:           # in the assembly_exception table that says: "For chr Y positions 1

 94:           # to 2709520, use chr X:1-2709520 for the assembly data."

 95:           # As a solution, what we'll do here, is split the assembly up in blocks:

 96:           # if a slice covers both the PAR and the allosomal region, we'll make

 97:           # two subslices (let's call them blocks not to intercede with the

 98:           # Slice#subslices method) and project these independently.

 99:           assembly_exceptions = AssemblyException.find_all_by_seq_region_id(self.seq_region.id)
100:           if assembly_exceptions.length > 0
101:             # Check if this bit of the original slice is covered in the

102:             # assembly_exception table.

103:             overlapping_exceptions = Array.new
104:             assembly_exceptions.each do |ae|
105:               if Slice.new(self.seq_region, ae.seq_region_start, ae.seq_region_end).overlaps?(self)
106:                 if ae.exc_type == 'HAP'
107:                   raise NotImplementedError, "The haplotype exceptions are not implemented (yet). You can't project this slice."
108:                 end
109:                 overlapping_exceptions.push(ae)
110:               end
111:             end
112: 
113:             if overlapping_exceptions.length > 0
114:               # First get all assembly blocks from chromosome Y

115:               source_assembly_blocks = self.excise(overlapping_exceptions.collect{|e| e.seq_region_start .. e.seq_region_end})
116:               # And insert the blocks of chromosome X

117:               all_assembly_blocks = Array.new #both for chr X and Y

118:               # First do all exceptions between the first and last block

119:               previous_block = nil
120:               source_assembly_blocks.sort_by{|b| b.start}.each do |b|
121:                 if previous_block.nil?
122:                   all_assembly_blocks.push(b)
123:                   previous_block = b
124:                   next
125:                 end
126:                 # Find the exception record

127:                 exception = nil
128:                 assembly_exceptions.each do |ae|
129:                   if ae.seq_region_end == b.start - 1
130:                     exception = ae
131:                     break
132:                   end
133:                 end
134: 
135:                 new_slice_start = exception.exc_seq_region_start + ( previous_block.stop - exception.seq_region_start )
136:                 new_slice_stop = exception.exc_seq_region_start + ( b.start - exception.seq_region_start )
137:                 new_slice_strand = self.strand * exception.ori
138:                 new_slice = Slice.fetch_by_region(self.seq_region.coord_system.name, SeqRegion.find(exception.exc_seq_region_id).name, new_slice_start, new_slice_stop, new_slice_strand)
139: 
140:                 all_assembly_blocks.push(new_slice)
141:                 all_assembly_blocks.push(b)
142:                 previous_block = b
143:               end
144: 
145:               # And then see if we have to add an additional one at the start or end

146:               first_block = source_assembly_blocks.sort_by{|b| b.start}[0]
147:               if first_block.start > self.start
148:                 exception = assembly_exceptions.sort_by{|ae| ae.seq_region_start}[0]
149:                 new_slice_start = exception.exc_seq_region_start + ( self.start - exception.seq_region_start )
150:                 new_slice_stop = exception.exc_seq_region_start + ( first_block.start - 1 - exception.seq_region_start )
151:                 new_slice_strand = self.strand * exception.ori
152:                 new_slice = Slice.fetch_by_region(self.seq_region.coord_system.name, SeqRegion.find(exception.exc_seq_region_id).name, new_slice_start, new_slice_stop, new_slice_strand)
153: 
154:                 all_assembly_blocks.unshift(new_slice)
155:               end
156: 
157:               last_block = source_assembly_blocks.sort_by{|b| b.start}[-1]
158:               if last_block.stop < self.stop
159:                 exception = assembly_exceptions.sort_by{|ae| ae.seq_region_start}[-1]
160:                 new_slice_start = exception.exc_seq_region_start + ( last_block.stop + 1 - exception.seq_region_start )
161:                 new_slice_stop = exception.exc_seq_region_start + ( self.stop - exception.seq_region_start )
162:                 new_slice_strand = self.strand * exception.ori
163:                 new_slice = Slice.fetch_by_region(self.seq_region.coord_system.name, SeqRegion.find(exception.exc_seq_region_id).name, new_slice_start, new_slice_stop, new_slice_strand)
164: 
165:                 all_assembly_blocks.shift(new_slice)
166:               end
167: 
168:               answer = Array.new
169:               all_assembly_blocks.each do |b|
170:                 answer.push(b.project(coord_system_name))
171:               end
172:               answer.flatten!
173: 
174:               return answer
175:             end
176: 
177:           end
178:           # END OF ASSEMBLY_EXCEPTIONS

179: 
180:           # Get all AssemblyLinks starting from this assembly and for which

181:           # the cmp_seq_region.coord_system is what we want.

182:           assembly_links = self.seq_region.assembly_links_as_assembly(coord_system_name)
183: 
184:           # Now reject all the components that lie _before_ the source, then

185:           # reject all the components that lie _after_ the source.

186:           # Then sort based on their positions.

187:           sorted_overlapping_assembly_links = assembly_links.reject{|al| al.asm_end < self.start}.reject{|al| al.asm_start > self.stop}.sort_by{|al| al.asm_start}
188:           if sorted_overlapping_assembly_links.length == 0
189:             return []
190:           end
191: 
192:           # What we'll do, is create slices for all the underlying components,

193:           # including the first and the last one. At first, the first and last

194:           # components are added in their entirity and will only be cropped afterwards.

195:           previous_stop = nil
196:           sorted_overlapping_assembly_links.each_index do |i|
197:             this_link = sorted_overlapping_assembly_links[i]
198:             if i == 0
199:               answer.push(Slice.new(this_link.cmp_seq_region, this_link.cmp_start, this_link.cmp_end, this_link.ori))
200:                 next
201:             end
202:             previous_link = sorted_overlapping_assembly_links[i-1]
203: 
204:             # If there is a gap with the previous link: add a gap

205:             if this_link.asm_start > ( previous_link.asm_end + 1 )
206:               gap_size = this_link.asm_start - previous_link.asm_end - 1
207:               answer.push(Gap.new(CoordSystem.find_by_name(coord_system_name), gap_size))
208:             end
209: 
210:             # And add the component itself as a Slice

211:             answer.push(Slice.new(this_link.cmp_seq_region, this_link.cmp_start, this_link.cmp_end, this_link.ori))
212:           end
213: 
214:           # Now see if we have to crop the first and/or last slice

215:           first_link = sorted_overlapping_assembly_links[0]
216:           if self.start > first_link.asm_start
217:             if first_link.ori == -1
218:               answer[0].stop = first_link.cmp_start + ( first_link.asm_end - self.start )
219:             else
220:               answer[0].start = first_link.cmp_start + ( self.start - first_link.asm_start )
221:             end
222:           end
223: 
224:           last_link = sorted_overlapping_assembly_links[-1]
225:           if self.stop < last_link.asm_end
226:             if last_link.ori == -1
227:               answer[-1].start = last_link.cmp_start + ( last_link.asm_end - self.stop)
228:             else
229:               answer[-1].stop = last_link.cmp_start + ( self.stop - last_link.asm_start )
230:             end
231:           end
232: 
233:           # And check if we have to add Ns at the front and/or back

234:           if self.start < first_link.asm_start
235:             gap_size = first_link.asm_start - self.start
236:             answer.unshift(Gap.new(CoordSystem.find_by_name(coord_system_name), gap_size))
237:           end
238:           if self.stop > last_link.asm_end
239:             gap_size = self.stop - last_link.asm_end
240:             answer.push(Gap.new(CoordSystem.find_by_name(coord_system_name), gap_size))
241:           end
242:         end
243:         return answer
244: 
245:       end

DESCRIPTION

Get all ProteinAlignFeatures that are located on a Slice for a given Analysis.

Pitfall: just looks at the CoordSystem that the Slice is located on. For example, if a Slice is located on a SeqRegion on the ‘chromosome’ CoordSystem, but all protein_align_features are annotated on SeqRegions of the ‘scaffold’ CoordSystem, this method will return an empty array.

USAGE

 my_slice.protein_align_features('Uniprot').each do |feature|
   puts feature.to_yaml
 end

Arguments:

  • code: name of analysis
Returns:array of ProteinAlignFeature objects

[Source]

     # File lib/ensembl/core/slice.rb, line 592
592:       def protein_align_features(analysis_name)
593:         if analysis_name.nil?
594:           return ProteinAlignFeature.find_by_sql('SELECT * FROM protein_align_feature WHERE seq_region_id = ' + self.seq_region.id.to_s + ' AND seq_region_start >= ' + self.start.to_s + ' AND seq_region_end <= ' + self.stop.to_s)
595:         else
596:           analysis = Analysis.find_by_logic_name(analysis_name)
597:           return ProteinAlignFeature.find_by_sql('SELECT * FROM protein_align_feature WHERE seq_region_id = ' + self.seq_region.id.to_s + ' AND seq_region_start >= ' + self.start.to_s + ' AND seq_region_end <= ' + self.stop.to_s + ' AND analysis_id = ' + analysis.id.to_s)
598:         end
599:       end

[Source]

     # File lib/ensembl/core/slice.rb, line 359
359:       def repeatmasked_seq
360:         raise NotImplementedError
361:       end

DESCRIPTION

Get the sequence of the Slice as a Bio::Sequence::NA object.

If the Slice is on a CoordSystem that is not seq_level, it will try to project it coordinates to the CoordSystem that does. At this moment, this is only done if there is a direct link between the two coordinate systems. (The perl API allows for following an indirect link as well.)

Caution: Bio::Sequence::NA makes the sequence downcase!!

USAGE

 my_slice.seq.seq.to_s

Arguments:none
Returns:Bio::Sequence::NA object

[Source]

     # File lib/ensembl/core/slice.rb, line 324
324:       def seq
325:         # If we already accessed the sequence, we can just

326:         # call the instance variable. Otherwise, we'll have

327:         # to get the sequence first and create a Bio::Sequence::NA

328:         # object.

329:         if @seq.nil?
330:           # First check if the slice is on the seqlevel coordinate

331:           # system, otherwise project coordinates.

332:           if self.seq_region.coord_system.seqlevel?
333:             @seq = Bio::Sequence::NA.new(self.seq_region.subseq(self.start, self.stop))
334:           else # we have to project coordinates

335:             seq_string = String.new
336:             @target_slices = self.project('seqlevel')
337: 
338:             @target_slices.each do |component|
339:               if component.class == Slice
340:                 seq_string += component.seq # This fetches the seq recursively (see 10 lines up)

341:               else # it's a Gap

342:                 seq_string += 'N' * (component.length)
343:               end
344: 
345:             end
346:             @seq = Bio::Sequence::NA.new(seq_string)
347: 
348:           end
349: 
350:           if self.strand == -1
351:             @seq.reverse_complement!
352:           end
353: 
354:         end
355:         return @seq
356: 
357:       end

DESCRIPTION

Creates overlapping subslices for a given Slice.

USAGE

 my_slice.split(50000, 250).each do |sub_slice|
   puts sub_slice.display_name
 end

Arguments:

  • max_size: maximal size of subslices (default: 100000)
  • overlap: overlap in bp between consecutive subslices (default: 0)
Returns:array of Ensembl::Core::Slice objects

[Source]

     # File lib/ensembl/core/slice.rb, line 391
391:       def split(max_size = 100000, overlap = 0)
392:         sub_slices = Array.new
393:         i = 0
394:         self.start.step(self.length, max_size - overlap - 1) do |i|
395:           sub_slices.push(self.sub_slice(i, i + max_size - 1))
396:         end
397:         i -= (overlap + 1)
398:         sub_slices.push(self.sub_slice(i + max_size))
399:         return sub_slices
400:       end

DESCRIPTION

Take a sub_slice from an existing one.

USAGE

 my_sub_slice = my_slice.sub_slice(400,500)

Arguments:

  • start: start of subslice relative to slice (default: start of slice)
  • stop: stop of subslice relative to slice (default: stop of slice)
Returns:Ensembl::Core::Slice object

[Source]

     # File lib/ensembl/core/slice.rb, line 374
374:       def sub_slice(start = self.start, stop = self.stop)
375:         return self.class.new(self.seq_region, start, stop, self.strand)
376:       end
to_s()

Alias for display_name

DESCRIPTION

The Slice#within? method checks if this slice is contained withing another one. The other slice has to be on the same coordinate system

USAGE

 slice_a = Slice.fetch_by_region('chromosome','X',1,1000)
 slice_b = Slice.fetch_by_region('chromosome','X',900,950)
 if slice_b.overlaps?(slice_a)
   puts "Slice b is within slice a"
 end

Arguments:another slice
Returns:true or false

[Source]

     # File lib/ensembl/core/slice.rb, line 240
240:       def within?(other_slice)
241:         if ! other_slice.class == Slice
242:           raise RuntimeError, "The Slice#overlaps? method takes a Slice object as its arguments."
243:         end
244:         if self.seq_region.coord_system != other_slice.seq_region.coord_system
245:           raise RuntimeError, "The argument slice of Slice#overlaps? has to be in the same coordinate system, but were " + self.seq_region.coord_system.name + " and " + other_slice.seq_region.coord_system.name
246:         end
247: 
248:         self_range = self.start .. self.stop
249:         other_range = other_slice.start .. other_slice.stop
250: 
251:         if other_range.include?(self.start) and other_range.include?(self.stop)
252:           return true
253:         else
254:           return false
255:         end
256:       end

[Validate]