The Computer Language Benchmarks Game

k-nucleotide Ruby JRuby #5 program

source code

# The Computer Language Benchmarks Game
# http://benchmarksgame.alioth.debian.org
#
# contributed by Aaron Tavistock

def find_frequencies(keys)
  counts = Hash.new(0)
  threads = []
  keys.each do |key|
    threads << Thread.new do
      key_string = key.to_s.freeze
      last_index = -1 # start before index 0 so a match at the first position is counted
      while last_index = @seq.index(key_string, last_index+1)
        counts[key] += 1
      end
    end
  end
  threads.each(&:join)
  counts
end

def frequency(keys)
  @frequencies.select { |k,_| keys.include?(k) }
end

def percentage(keys)
  frequency(keys).sort { |a,b| b[1] <=> a[1] }.map do |key, value|
    "%s %.3f" % [ key.upcase, ( (value*100).to_f / @seq.size) ]
  end
end

def count(keys)
  frequency(keys).sort_by { |a| a[0].size }.map do |key, value|
    "#{value}\t#{key.upcase}"
  end
end

def load_sequence(marker)
  input = STDIN.read
  start_idx = input.index(marker) + marker.size
  seq = input[start_idx..-1] # everything after the marker line
  seq.delete!(' ')
  seq.delete!("\n")
  seq.freeze
  seq
end

singles = %i(a t c g)
doubles = %i(aa at ac ag ta tt tc tg ca ct cc cg ga gt gc gg)

# count ALL the 3- 4- 6- 12- and 18-nucleotide sequences
#chains  = %i(ggt ggta ggtatt ggtattttaatt ggtattttaatttatagt)

@seq = load_sequence('>THREE Homo sapiens frequency')
@frequencies = find_frequencies(singles + doubles + chains)

print "#{percentage(singles).join("\n")}\n\n"
print "#{percentage(doubles).join("\n")}\n\n"
print "#{count(chains).join("\n")}\n"
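
In the listing above, every thread writes into one shared Hash. On JRuby, threads run in parallel and core collections are not guaranteed to be safe under concurrent mutation, so a more defensive variant might have each thread return its own tally and merge the results afterwards. The following is a minimal sketch under that assumption, not the contributor's method; the name `find_frequencies_isolated` and the short sample sequence are hypothetical.

```ruby
# Sketch: one thread per key, each with a private counter.
# Thread#value joins the thread and returns the block's result,
# so no Hash is mutated from more than one thread.
def find_frequencies_isolated(seq, keys)
  threads = keys.map do |key|
    Thread.new do
      needle = key.to_s
      count = 0
      index = -1 # start before index 0 so a match at position 0 is counted
      while (index = seq.index(needle, index + 1))
        count += 1
      end
      [key, count]
    end
  end
  threads.map(&:value).to_h
end

# Illustrative input only; the benchmark reads the sequence from STDIN.
result = find_frequencies_isolated('ggtattggt', %i(g t gg ggt))
p result
```

Unlike the `Hash.new(0)` approach, this variant also records keys that never occur (with a count of 0), which changes what `frequency` would select.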
    

notes, command-line, and program output

NOTES:
32-bit Ubuntu one core
jruby 9.1.0.0 (2.3.0) 2016-05-02 a633c63 Java HotSpot(TM) Server VM 25.92-b14 on 1.8.0_92-b14 +jit [linux-i386]



Thu, 22 Sep 2016 17:41:14 GMT

MAKE:
mv knucleotide.jruby-5.jruby knucleotide.rb
0.01s to complete and log all make actions

COMMAND LINE:
/usr/local/src/jruby-9.1.0.0/bin/jruby -Xcompile.fastest=true -Xcompile.invokedynamic=true -J-server -J-Xmn512m -J-Xms2048m -J-Xmx2048m knucleotide.rb 0 < knucleotide-input250000.txt

PROGRAM FAILED 


PROGRAM OUTPUT:

NameError: undefined local variable or method `chains' for main:Object
  <top> at knucleotide.rb:55
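
The NameError is consistent with the source listing: the `chains` assignment is commented out, yet `chains` is still passed to `find_frequencies` and `count`. A minimal sketch of the likely repair follows, assuming the commented-out list was meant to be active. The sample sequence is illustrative, and the lookahead `scan` is a swapped-in way to count overlapping matches, not the program's own `index` loop.

```ruby
# Restore the definition that the submitted program left commented out.
chains = %i(ggt ggta ggtatt ggtattttaatt ggtattttaatttatagt)

# Illustrative stand-in for the sequence normally read from STDIN.
sample = 'ggtattttaatttatagtggta'.freeze

# A zero-width lookahead matches at every start position, so overlapping
# occurrences are counted as well.
chain_counts = chains.map do |key|
  [key, sample.scan(/(?=#{key})/).size]
end.to_h

chain_counts.each { |key, value| puts "#{value}\t#{key.to_s.upcase}" }
```

With `chains` defined, the top-level call `find_frequencies(singles + doubles + chains)` no longer references an undefined name.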