
 performance measurements

Each table row shows performance measurements for this Lua program with a particular command-line input value N.

         N  CPU secs  Elapsed secs  Memory KB  Code B  ≈ CPU Load
    50,000      0.20          0.22        276     669  0% 0% 9% 95%
   500,000      1.98          1.99     28,340     669  1% 0% 1% 100%
 5,000,000     19.69         19.71    224,592     669  0% 0% 0% 100%
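
As a rough reading of the table, CPU time grows close to linearly with N. The sketch below only divides the reported CPU seconds by N; the `rows` and `per_million` names are illustrative and not part of the benchmark.

```lua
-- Rows copied from the measurements table: N and CPU seconds.
rows = {
  { n = 50000,   cpu = 0.20 },
  { n = 500000,  cpu = 1.98 },
  { n = 5000000, cpu = 19.69 },
}

-- CPU seconds per million of N, one value per table row.
per_million = {}
for i, r in ipairs(rows) do
  per_million[i] = r.cpu / (r.n / 1e6)
  print(string.format("N=%d: %.2f CPU secs per million", r.n, per_million[i]))
end
```

All three ratios come out near 4 CPU seconds per million, which is what roughly linear scaling would look like.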

Read the make, command-line, and program output logs below to see how this program was run.

Read the regex-dna benchmark description to see what this program should do.

 notes

Lua 5.2.1 Copyright (C) 1994-2012 Lua.org, PUC-Rio

Don't split pattern at |
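
This note appears to mean that each variant must be matched as one pattern rather than as two separate searches, one for each side of the |. The program in the next section does this by rewriting | into the ordered choice operator / of LPeg's re module. The sketch below mimics that rewrite in plain Lua so it runs without LPeg. `to_re_syntax` is a hypothetical helper, not part of the program, and it assumes the only regex operators present are well-formed bracket classes and |.

```lua
-- Hypothetical helper (not part of the benchmark program): rewrites a
-- regex-style variant into the syntax accepted by LPeg's re module.
-- Assumes every '[' has a matching ']' and that the only regex
-- operators present are bracket classes and '|'.
function to_re_syntax(pat)
  local out, i = {}, 1
  while i <= #pat do
    local c = pat:sub(i, i)
    if c == '[' then
      -- Copy a bracket class such as [cgt] through unchanged.
      local j = pat:find(']', i, true)
      out[#out + 1] = pat:sub(i, j)
      i = j + 1
    elseif c == '|' then
      -- Regex alternation becomes re's ordered choice.
      out[#out + 1] = '/'
      i = i + 1
    else
      -- Quote a run of literal characters up to the next '[' or '|'.
      local j = pat:find('[%[|]', i) or (#pat + 1)
      out[#out + 1] = "'" .. pat:sub(i, j - 1) .. "'"
      i = j
    end
  end
  return table.concat(out)
end

print(to_re_syntax('agggtaaa|tttaccct'))          -- 'agggtaaa'/'tttaccct'
print(to_re_syntax('a[act]ggtaaa|tttacc[agt]t'))  -- 'a'[act]'ggtaaa'/'tttacc'[agt]'t'
```

The program itself then wraps the rewritten pattern as `({...}/.)*`, so a single pass over the sequence captures every match of either alternative.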

 regex-dna Lua #3 program source code

-- The Computer Language Benchmarks Game
-- http://benchmarksgame.alioth.debian.org/
-- contributed by Jim Roseborough
-- modified by Victor Tang
-- optimized & replaced inefficient use of gsub with gmatch
-- partitioned sequence to prevent extraneous redundant string copy
-- modified to use Lpeg's re module for matching variants


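re = require 're'
seq = io.read("*a")   -- read the whole FASTA input
-- ilen: input length; then strip header lines and all control characters
ilen, seq = #seq, re.gsub(seq, '">"[^%c]*%c*', ''):gsub('%c+', '')
clen = #seq           -- length of the cleaned sequence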

local variants = { 'agggtaaa|tttaccct',
                   '[cgt]gggtaaa|tttaccc[acg]',
                   'a[act]ggtaaa|tttacc[agt]t',
                   'ag[act]gtaaa|tttac[agt]ct',
                   'agg[act]taaa|ttta[agt]cct',
                   'aggg[acg]aaa|ttt[cgt]ccct',
                   'agggt[cgt]aa|tt[acg]accct',
                   'agggta[cgt]a|t[acg]taccct',
                   'agggtaa[cgt]|[acg]ttaccct', }

local subst = { B='(c|g|t)', D='(a|g|t)',   H='(a|c|t)', K='(g|t)',
                M='(a|c)',   N='(a|c|g|t)', R='(a|g)',   S='(c|g)',
                V='(a|c|g)', W='(a|t)',     Y='(c|t)' }

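-- Translate a regex-style variant into LPeg re syntax.
function retolpeg(pat)
  -- quote each literal run outside a [...] class, e.g. agg -> 'agg'
  pat = re.gsub(pat, "!'['{%w+}!']'", "'%1'")
  -- regex alternation | becomes re's ordered choice /
  pat = re.gsub(pat, "'|'", "/")
  -- at each position: capture a match of pat, or else skip one character
  return "({"..pat.."}/.)*"
end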

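-- Count the non-overlapping matches of variant in seq.
function countmatches(variant)
   local t = { re.match(seq, retolpeg(variant)) }
   -- with no captures, re.match returns only the end position (a number)
   return type(t[1]) == 'number' and 0 or #t
end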

for _, p in ipairs(variants) do
   io.write( string.format('%s %d\n', p, countmatches(p)) )
end

-- Split seq into chunks of about sqrt(#seq) characters.
function partitionstring(seq)
  local seg = math.floor( math.sqrt(#seq) )
  local seqtable = {}
  for nextstart = 1, #seq, seg do
    table.insert(seqtable, seq:sub(nextstart, nextstart + seg - 1))
  end
  return seqtable
end

-- Replace k with v in every chunk of t that contains k.
function chunk_gsub(t, k, v)
  for i, p in ipairs(t) do
    t[i] = p:find(k) and p:gsub(k, v) or t[i]
  end
  return t
end

seq = partitionstring(seq)
for k, v in pairs(subst) do
  chunk_gsub(seq, k, v)
end
seq = table.concat(seq)
io.write(string.format('\n%d\n%d\n%d\n', ilen, clen, #seq))
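
The chunked substitution above should give the same result as one gsub over the whole string, because every code being replaced is a single character and so cannot straddle a chunk boundary. The sketch below checks this on a small made-up string. The shortened `subst_demo` table, the function names, and the `math.max` guard for very short strings are assumptions added for illustration.

```lua
-- Shortened substitution table for the demo (the program uses 11 codes).
subst_demo = { B = '(c|g|t)', K = '(g|t)', N = '(a|c|g|t)' }

-- Split s into pieces of about sqrt(#s) characters.
function partition(s)
  local seg = math.max(1, math.floor(math.sqrt(#s)))  -- guard: never step by 0
  local parts = {}
  for start = 1, #s, seg do
    parts[#parts + 1] = s:sub(start, start + seg - 1)
  end
  return parts
end

-- Substitute chunk by chunk, leaving chunks without the code untouched.
function substitute_chunked(s)
  local parts = partition(s)
  for code, alt in pairs(subst_demo) do
    for i, p in ipairs(parts) do
      parts[i] = p:find(code, 1, true) and (p:gsub(code, alt)) or p
    end
  end
  return table.concat(parts)
end

-- Substitute over the whole string, copying it once per code.
function substitute_whole(s)
  for code, alt in pairs(subst_demo) do
    s = s:gsub(code, alt)
  end
  return s
end

sample = 'acgtBacKKtNga'
print(substitute_chunked(sample))   -- acgt(c|g|t)ac(g|t)(g|t)t(a|c|g|t)ga
print(substitute_chunked(sample) == substitute_whole(sample))   -- true
```

The order in which `pairs` visits the codes does not matter here, since the replacement text contains only lowercase letters and punctuation and therefore never introduces another uppercase code.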

 make, command-line, and program output logs

Tue, 06 Aug 2013 18:36:21 GMT

COMMAND LINE:
/usr/local/src/lua-5.2.2/install/bin/lua  regexdna.lua-3.lua 0 < regexdna-input5000000.txt

PROGRAM OUTPUT:
agggtaaa|tttaccct 356
[cgt]gggtaaa|tttaccc[acg] 1250
a[act]ggtaaa|tttacc[agt]t 4252
ag[act]gtaaa|tttac[agt]ct 2894
agg[act]taaa|ttta[agt]cct 5435
aggg[acg]aaa|ttt[cgt]ccct 1537
agggt[cgt]aa|tt[acg]accct 1431
agggta[cgt]a|t[acg]taccct 1608
agggtaa[cgt]|[acg]ttaccct 2178

50833411
50000000
66800214
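
The three numbers that close the output are, in order, the raw input length, the length after header lines and newlines are stripped, and the length after the code substitutions. A minimal sketch on a made-up input follows. It uses plain `string.gsub` in place of the program's `re.gsub` call and handles only two codes; both are simplifications for illustration.

```lua
-- Tiny made-up FASTA input: one header line and two sequence lines.
input = ">ONE demo sequence\nacgtB\nggK\n"

ilen_demo = #input                     -- raw length, header and newlines included

-- Drop the header line, then the remaining newlines.
cleaned = input:gsub(">[^\n]*\n", ""):gsub("\n", "")
clen_demo = #cleaned                   -- length of the bare sequence

-- Expand two of the codes into their alternatives.
expanded = cleaned:gsub("B", "(c|g|t)"):gsub("K", "(g|t)")

print(ilen_demo, clen_demo, #expanded) -- 29  8  18
```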

Revised BSD license
