The Computer Language
Benchmarks Game

regex-dna Ruby JRuby #7 program

source code

# The Computer Language Benchmarks Game
# http://benchmarksgame.alioth.debian.org
#
# contributed by jose fco. gonzalez
# optimized & parallelized by Rick Branson
# further optimised by Scott Leggett

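# read the whole FASTA input and record its original length
seq = STDIN.readlines.join
ilen = seq.size

# strip the ">..." description lines and all newlines
seq.gsub!(/>.*\n|\n/,"")
clen = seq.length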

MATCHERS = [
  /agggtaaa|tttaccct/,
  /[cgt]gggtaaa|tttaccc[acg]/,
  /a[act]ggtaaa|tttacc[agt]t/,
  /ag[act]gtaaa|tttac[agt]ct/,
  /agg[act]taaa|ttta[agt]cct/,
  /aggg[acg]aaa|ttt[cgt]ccct/,
  /agggt[cgt]aa|tt[acg]accct/,
  /agggta[cgt]a|t[acg]taccct/,
  /agggtaa[cgt]|[acg]ttaccct/
]

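# JRuby cannot fork, so count matches in threads there;
# elsewhere fork one child process per pattern and read results over pipes
if RUBY_PLATFORM == "java"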
  threads = MATCHERS.map do |f|
    Thread.new do
      Thread.current[:result] = "#{f.source} #{seq.scan(f).size}"
    end
  end

  threads.each do |t|
    t.join
  end

  threads.each do |t|
    puts t[:result]
  end
else
  children = MATCHERS.map do |f|
    r, w = IO.pipe
    p = Process.fork do
      r.close
      w.write "#{f.source} #{seq.scan(f).size}"
      w.close
    end
  
    w.close
    [p, r, w]
  end

  children.each do |p, r, w|
    puts r.read
    r.close
  end

  Process.waitall
end

# expand IUPAC ambiguity codes into explicit alternatives (S is c or g)
seq.gsub!(/[BDHKMNRSVWY]/, {
    'B' => '(c|g|t)', 'D' => '(a|g|t)', 'H' => '(a|c|t)', 'K' => '(g|t)',
    'M' => '(a|c)', 'N' => '(a|c|g|t)', 'R' => '(a|g)', 'S' => '(c|g)',
    'V' => '(a|c|g)', 'W' => '(a|t)', 'Y' => '(c|t)'
})

puts
puts ilen
puts clen
puts seq.length
    

notes, command-line, and program output

NOTES:
32-bit Ubuntu one core
jruby 9.1.0.0 (2.3.0) 2016-05-02 a633c63 Java HotSpot(TM) Server VM 25.92-b14 on 1.8.0_92-b14 +jit [linux-i386]



Wed, 04 May 2016 05:37:31 GMT

MAKE:
mv regexdna.jruby-7.jruby regexdna.rb
0.01s to complete and log all make actions

COMMAND LINE:
/usr/local/src/jruby-9.1.0.0/bin/jruby -Xcompile.fastest=true -Xcompile.invokedynamic=true -J-server -J-Xmn512m -J-Xms2048m -J-Xmx2048m regexdna.rb 0 < regexdna-input5000000.txt

PROGRAM OUTPUT:
agggtaaa|tttaccct 356
[cgt]gggtaaa|tttaccc[acg] 1250
a[act]ggtaaa|tttacc[agt]t 4252
ag[act]gtaaa|tttac[agt]ct 2894
agg[act]taaa|ttta[agt]cct 5435
aggg[acg]aaa|ttt[cgt]ccct 1537
agggt[cgt]aa|tt[acg]accct 1431
agggta[cgt]a|t[acg]taccct 1608
agggtaa[cgt]|[acg]ttaccct 2178

50833411
50000000
66800214
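
The three trailing numbers are, in order, the values printed by `puts ilen`, `puts clen`, and `puts seq.length` in the program: the raw input length, the length after removing description lines and newlines, and the length after the IUPAC code substitution. The following is a minimal sketch of that bookkeeping on a toy input. The method name `sequence_lengths` and the sample string are hypothetical and only two of the eleven codes are handled, so it illustrates the idea rather than reproducing the benchmark program.

```ruby
# Toy FASTA-style input (hypothetical; not the benchmark data).
input = ">ONE sample\nagggtaaaB\ntttaccctN\n"

# Returns [raw length, cleaned length, length after substitution].
def sequence_lengths(input)
  seq = input.dup
  ilen = seq.size

  # Drop ">..." description lines and every newline.
  seq.gsub!(/>.*\n|\n/, "")
  clen = seq.length

  # Hash form of gsub!: each matched character is replaced by its hash value.
  # Only B and N are mapped here to keep the sketch short.
  seq.gsub!(/[BN]/, { 'B' => '(c|g|t)', 'N' => '(a|c|g|t)' })

  [ilen, clen, seq.length]
end

p sequence_lengths(input)
```

Because every replacement is longer than the single character it replaces, the third number grows with the count of ambiguity codes in the cleaned sequence.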