Human DNA is full of repetitive elements
A pipe-based I/O stream:
name_sequence.pl < new.dna |
quality_check.pl |
vector_check.pl |
find_repeats.pl |
search_big_database.pl |
load_lab_database.pl
(A file containing the new DNA sequence is processed by a Perl script named "name_sequence.pl". The output from name_sequence.pl is then passed to the quality checking program, which looks for the SEQUENCE tag, runs the quality checking algorithm, and writes its conclusion to the data stream. Next the data stream enters the vector checker, which pulls the SEQUENCE tag out of the stream and runs the vector checking algorithm. This continues down the pipeline until the "load_lab_database.pl" script collates all the data collected, makes some final conclusions about whether the sequence is suitable for further use, and enters all the results into the laboratory database.)
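One stage of such a pipeline can be sketched without the Boulder module. This is only a minimal sketch, assuming the stream consists of tag=value lines with a lone "=" closing each record. The annotate_record and run_stage helpers are hypothetical, and the "is the sequence non-empty" test is a placeholder for a real quality algorithm.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the tag=value lines of one record and
# returns them with a QUALITY_CHECK line appended.
sub annotate_record {
    my @lines = @_;
    my %tag;
    for my $line (@lines) {
        my ($key, $value) = split /=/, $line, 2;
        $tag{$key} = $value;
    }
    # Placeholder check: a real stage would run its own algorithm here.
    my $ok = defined $tag{SEQUENCE} && length $tag{SEQUENCE};
    return (@lines, 'QUALITY_CHECK=' . ($ok ? 'OK' : 'FAILED'));
}

# Hypothetical filter loop: copy records from $in to $out, annotating each.
sub run_stage {
    my ($in, $out) = @_;
    my @record;
    while (my $line = <$in>) {
        chomp $line;
        if ($line eq '=') {    # assumed end-of-record marker
            print {$out} "$_\n" for annotate_record(@record);
            print {$out} "=\n";
            @record = ();
        }
        else {
            push @record, $line;
        }
    }
    return;
}

# Demo on an in-memory stream; a real stage would pass \*STDIN and \*STDOUT.
my $input  = "NAME=seq1\nSEQUENCE=GATTACA\n=\n";
my $output = '';
open my $in,  '<', \$input  or die "cannot open input: $!";
open my $out, '>', \$output or die "cannot open output: $!";
run_stage($in, $out);
close $out;
print $output;
```

Because each stage only appends tags and passes everything else through, stages written this way can be chained with pipes in any order.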
----------
use Boulder::Stream;
$stream = Boulder::Stream->new;
# Read each record that carries the NAME and SEQUENCE tags
while ($record = $stream->read_record('NAME','SEQUENCE')) {
    $name = $record->get('NAME');
    $sequence = $record->get('SEQUENCE');
    # ...[continue processing]...
    # Add this stage's result and pass the record down the pipe
    $record->add(QUALITY_CHECK=>"OK");
    $stream->write_record($record);
}
($name,$start_position,$length) = split("\t");
#(split with one argument works on $_; I read the file into an array and used this split on each line in a foreach loop)
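The array-plus-foreach approach in the note above might look like the following. It is a rough sketch: the two sample lines are made up and stand in for the contents of a real tab-separated file.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the contents of a file.
my @lines = ("repeatA\t120\t35\n", "repeatB\t900\t12\n");

my @parsed;
foreach my $line (@lines) {
    chomp $line;
    # Split on tabs into the three expected columns.
    my ($name, $start_position, $length) = split /\t/, $line;
    push @parsed, { name => $name, start => $start_position, length => $length };
    print "$name starts at $start_position and spans $length bases\n";
}
```

Naming the loop variable and passing it to split explicitly avoids relying on $_, which is easy to clobber in longer loops.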
----------
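# Remove a number N together with the N characters that follow it (replace with nothing).
# Note: "." x $1 repeats the dot N times; "."*$1 would multiply and yield 0.
$row =~ s/(\d+)(??{"." x $1})//xg;
# Remove an indel: a sign, a number N, and exactly N bases
$row =~ s/[+-](\d+)(??{"[ACGTN]{$1}"})//gi;
print($row, "\n");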
use feature qw(say);
my $DNA = ',...........,,....,,g.,,,,,,,,,,,.+12GATGCTGTGTTT..,,,,,.,,.,,-8tgatgctg,,,,,,,,..';
say $DNA;
$DNA =~ s/\d+[ATGCatgc]*//g;
say $DNA;
(say is just like print, but implicitly appends a newline. say LIST is simply an abbreviation for { local $\ = "\n"; print LIST }.)
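The two substitutions above behave differently when a real base directly follows an indel. Here is a small sketch of the difference. The helper names strip_naive and strip_counted and the short test string are made up for illustration, and the character class spells out both cases rather than relying on the /i flag.

```perl
use strict;
use warnings;

# Naive: remove the digits and every base letter that follows them.
sub strip_naive {
    my ($s) = @_;
    $s =~ s/\d+[ATGCatgc]*//g;
    return $s;
}

# Counted: remove the sign, the number N, and exactly N base letters.
sub strip_counted {
    my ($s) = @_;
    $s =~ s/[+-](\d+)(??{ "[ACGTNacgtn]{$1}" })//g;
    return $s;
}

# "+2AG" is the insertion; the "T" after it is a separate base call.
my $reads = '.+2AGT,';
print strip_naive($reads),   "\n";   # prints ".+,"
print strip_counted($reads), "\n";   # prints ".T,"
```

The naive pattern leaves the sign behind and also eats the trailing T, while the counted pattern removes only what the number says it should.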
----------
The hidden .panfs files are a result of having a mounted directory. While they are present they are "open" and can't be removed. I have found that rebooting or unmounting clears these files, and then you can delete the directory:
rm -rf dir (works even without root)
----------
(?:...) creates a "phantom" group, that is, a non-capturing group
(A) (?:B) (C)
We'll have 2 capture groups: $1 = A and $2 = C
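A quick way to check this is to match in list context and count what comes back. This is a minimal sketch using a made-up test string.

```perl
use strict;
use warnings;

my $text = 'A B C';

# In list context a match returns only the captured groups.
my @groups = $text =~ /(A) (?:B) (C)/;

print scalar(@groups), " groups: @groups\n";   # prints "2 groups: A C"
```

The B still has to be present for the match to succeed; it is just not stored in a numbered variable.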
----------
use feature qw(say);
# Print every line that follows the __DATA__ marker
while (<DATA>) {
    chomp;
    say;
}
__DATA__
line1
line2
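# length(@list) does NOT give the number of words: the array is evaluated in scalar
# context (its element count) and length() measures that number as a string.
# Use scalar(@list) to get the number of elements.
my @list1 = qw(The quick brown fox jumps over the lazy Perl programmer);
my @list2 = qw(a d c x y m n adc dbe);
print length(@list1), "\n";
print length(@list2), "\n";
#(For the 10-element @list1, length(@list1) returns 2 because length(10) == 2; for the 9-element @list2, length(@list2) returns 1 because length(9) == 1)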
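To make the difference explicit, the element count and the length of that count can be printed side by side. This is a small sketch that reuses the two word lists from above.

```perl
use strict;
use warnings;

my @list1 = qw(The quick brown fox jumps over the lazy Perl programmer);
my @list2 = qw(a d c x y m n adc dbe);

for my $list (\@list1, \@list2) {
    my $count  = scalar @$list;    # number of elements
    my $digits = length $count;    # number of characters in that number
    print "elements: $count, length of that number: $digits\n";
}
```

Storing the count in a scalar first also avoids the warning newer Perls give for calling length() directly on an array.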
----------