Well, one of my posts caused quite a bit of attention so I am revisiting the subject. I’ll start by explaining what the idea is:
I have a text file containing numbers, a random amount of numbers per line and the numbers are of random size. Can you quickly give me a sum of all numbers? The file has to be read from stdin.
You can find the example file I used here.
I made a solution in Python, using generator comprehensions. However, a battle ensued in the comments to create the quickest version of this program and after a few comments about 30% was shaven off the time my Python program took. Then things took a turn, comments started dripping in of other languages. Haskell, PHP, Ruby and C amongst others were submitted.
Thus I have decided to create a little posting about this specific problem. Firstly I have compiled and run all the different versions and languages on my hardware and timed then. I ran every tool 10 times and averaged the result which resulted in this graph:
Source codes are here, however, if you think you can do any better than this please submit a link to a pastebin in the comments with some explanation and I’ll update this post and graph. I’d like for funky languages to be included as well, as long as I can compile it on my Debian server!
Results (scroll down for credits and source code for each language, think you can do better or have other languages? comment!):

- PHP: 2.53 s
- x86 asm: 0.04 s
- Ruby 1.9.1: 3.59 s
- C: 0.04 s
- Python: 2.18 s
- Haskell: 0.05 s
- gawk: 1.61 s
- Perl: 4.74 s
- Vala: 2.01 s
I also ran the fastest three on a large input file. Haskell fails, giving the wrong number.

- C: 2.91 s
- x86 asm: 4.39 s
Clearly, C and Haskell reign supreme even knowing more about the processor instructions then RayNbow’s x86 solution. Python is fastest at the interpreted languages. I ran the Ruby sample on Ruby 1.8 as well which took 20ish seconds, Ruby is surely stepping up it’s game, speedwise, still lacking a bit behind PHP though.
C (luite) (gcc -O2):
#include <stdio.h>
#include <stdint.h>
int main(int argc, char** argv) {
uint64_t sum=0;
char buf[8192];
int cur=0;
int i,j;
while((i=fread(buf,1,8192,stdin))>0) {
for(j=0;j<i;j++) {
char c = buf[j];
if(c >= '0' && c <= '9') {
cur = (cur*10)+c-48;
} else {
sum+=cur;
cur=0;
}
}
}
sum+=cur;
printf("total: %llu\n", sum); /* what is the correct format type for uint64_t? */
}
Haskell (Don Stewart):
{-# LANGUAGE BangPatterns #-}
--
-- The Computer Language Shootout
-- http://shootout.alioth.debian.org/
--
-- A lazy bytestring solution
--
import qualified Data.ByteString.Char8 as S
main = do
putStr "Grand total "
print . go 0 =<< S.getContents
where
go !n !s = case S.readInt s of -- lazily reads current line
Nothing -> n
Just (k,t) -> go (n+k) (S.tail t)
{-
-- Important: turn on optimizations:
$ ghc -O2 --make A.hs
[1 of 1] Compiling Main ( A.hs, A.o )
Linking A ...
$ cksum input_list_vs_gen.txt
2913999841 17293232 input_list_vs_gen.txt
$ du -hs input_list_vs_gen.txt
17M input_list_vs_gen.txt
$ time ./A < input_list_vs_gen.txt
Grand total 17677692470
./A < input_list_vs_gen.txt 0.19s user 0.06s system 98% cpu 0.254 total
-}
PHP (John Piasetzki):
while($line = fgets(STDIN))
$s += array_sum(explode(' ',$line));
echo $s;
Python (Pythoneer):
#!/usr/bin/env/python import sys from itertools import imap print sum(sum(imap(int, line.split())) for line in sys.stdin)
Ruby (Thomas Hurst):
#!/usr/bin/env ruby
total = 0
STDIN.each_line {|line| line.split(" ").each {|x| total += x.to_i } }
puts "Grand total: #{total}"
x86 asm (James Block):
; for NASM, a 386 or better, and a 2.6-series Linux kernel ; it should run fine on a 64-bit system with appropriate flags for NASM ; assemble with nasm -f elf FILENAME.asm && ld FILENAME.o -o OUTPUTFILENAME ; run as ./OUTPUTFILENAME < INPUT_NUMBERS.txt ; expect weirdness with nonconforming input. this is pure assembly, you know ; each input number should be at most 32 bits, and the sum at most 64 bits buffersize equ (1<<16) ;use 64kB chunks; can be set to any value ;64kB is fastest on this old Willamette P4 section .data message: db "Grand total:" printbuffer: times 23 db 0x20 dw 0x0A msglen equ $ - message section .text global _start _start: ;we want to read a bunch of numbers from STDIN ;we'll do this by allocating memory with mmap() ;and then reading stdin with read() ; ;mmap signature: ; void* mmap(void* start, size_t length, int prot, int flags, ; int fd, off_t offset); ; (defined in <sys/mman.h>) ; ;arguments: ; start - pass 0, usually ignored ; length - pass $BIGNUM to allocate a $BIGNUM chunk ; prot - pass PROT_READ | PROT_WRITE (0x01|0x02) ; flags - pass MAP_PRIVATE (0x02) | MAP_ANONYMOUS (0x20) ; fd - pass anything (-1) ; offset - pass anything ; ; linux calling convention: 6-arg syscalls use the stack ; push arguments in REVERSE order ; ; system call number goes in eax: ; exit: 1 ; read: 3 ; write: 4 ; mmap: 90 ; munmap: 91 mov esi, buffersize ;store buffer size in esi push dword 0 ;offset push dword -1 ;fd push dword 0x22 ;flags - MAP_PRIVATE|MAP_ANONYMOUS push dword 0x03 ;prot - PROT_READ|PROT_WRITE push dword esi ;allocation size push dword 0 ;start mov eax, 90 mov ebx, esp int 80h add esp, (4*6) ;clean up the refuse from calling mmap() ;now eax holds the start address for our memory! yay! ;let's put it in ebp instead, since ebp isn't good for much else mov ebp, eax ;now let's read() in the data from stdin ;read() signature: ; ssize_t read(int fd, void* buf, size_t count) ;3-arg syscalls pass arguments in registers, so ;this one's much easier than mmap(): mov ecx, eax mov ebx, 0 mov eax, 3 mov edx, esi int 80h ;accumulate in the traditional pairing edx:eax ;let ecx index our position in the memory block ;read digits into ebx, accumulate into numbers in edi ; then sum edi to the main accumulator, edx:eax ; anything but a digit ends a number ; zero byte ends all counting xor edx, edx xor eax, eax xor ecx, ecx dec ecx readdigit: xor edi, edi readdigit_lp: xor ebx, ebx inc ecx cmp ecx, esi je readmore gotmore:mov bl, byte [ebp+ecx] test bl, bl jz finished_summing ;exit on zero byte sub bl, 48 js next_number cmp bl, 9 ja next_number ;if this is not a digit, skip it and sum shl edi, 1 lea edi, [edi*4 + edi] add edi, ebx jmp readdigit_lp next_number: add eax, edi jnc readdigit inc edx jmp readdigit readmore: mov esp, eax mov eax, 3 xor ebx, ebx mov ecx, ebp xchg edx, esi int 80h ;if we didn't fill the whole buffer, stick a zero byte after the ; bit we *did* read, to ensure proper termination of the loop cmp eax, edx je readcleanup mov byte [ebp+eax+1], 00 readcleanup: xor ecx, ecx xchg edx, esi mov eax, esp jmp gotmore finished_summing: ;edx:eax contains the 64-bit sum ;now we have to print it ;the following is an EXTREMELY ugly printf(%llu) substitute mov ecx, (printbuffer+22) mov ebp, 10 mov esp, 429496729 printlp64: ;if we can bail on this horrid 64-bit print loop ; and use a much saner 32-bit version, ; then, damn it, EXIT NOW test edx, edx jz printlp32 ;algorithm (64-bit divmod): let W = 2**32, Z = X + Y*W ; (i.e., X is eax and Y is edx) ; then let M = 10 (modulus) and define ; X divmod M = (x_D, x_M) (the definition of divmod: ; Y divmod M = (y_D, y_M) (quotient, remainder)) ; W divmod M = (W_D, W_M) ; ; let u = x_M + y_M * W_M and calculate ; u divmod M = (u_D, u_M) ; ; then the following is true: ; Z divmod M = (y_D * W + x_D + y_M*W_D + u_D, u_M) ; ; the following pile of junk implements the above ; (and allows for overflow in a couple places) mov edi, eax mov eax, edx xor edx, edx div ebp lea ebx, [edx*3] shl ebx, 1 mov esi, eax mov eax, edx mul esp ;edx:eax has y_M*W_D, ebx has y_M*W_M, edi has x, esi has y_D add esi, edx xor edx, edx xchg eax, edi div ebp ;eax has x_D, edx has x_M, esi has y_D, edi has y_M*W_D add ebx, edx ;ebx has u add edi, eax xor edx, edx mov eax, ebx div ebp ;eax has u_D, edx has u_M add dl, 0x30 mov byte [ecx], dl dec ecx mov edx, esi add eax, edi jnc printlp64 inc edx jmp printlp64 printlp32: div ebp add dl, 0x30 mov byte [ecx], dl dec ecx xor edx, edx test eax, eax jnz printlp32 EXIT: mov eax, 4 mov ebx, 1 mov ecx, message mov edx, msglen int 80h mov eax, 1 mov ebx, 0 int 80h
Perl (Michael Pyne):
#!/usr/bin/env perl
use strict;
my $sum = 0;
while(<STDIN>) {
my @nums = split(' ');
foreach my $num (@nums) { $sum += $num; }
}
print "$sum\n";
Gawk (Eric Wong):
gawk 'END { printf "%d\n", n } { for (i=NF;i>0;--i) n += $i}'
Vala (snuxoll):
using GLib;
namespace snuxoll {
class StdinReader : Object {
public StdinReader () {
}
public bool eof() {
return stdin.eof();
}
public string? read_line() {
var line = new StringBuilder();
var buffer = new char[1024];
while ( ( !line.str.has_suffix ("\n") ) || this.eof() ) {
string chunk = stdin.gets(buffer);
if (chunk != null) {
line.append(chunk);
} else {
return null;
}
}
return line.str;
}
}
}
// start other file
using GLib;
using snuxoll;
class SumApplication : Object {
private uint64 total { get; set; }
public SumApplication() {
this.total = 0;
}
public void run() {
var reader = new StdinReader();
while ( !stdin.eof() ) {
var line = reader.read_line();
if ( line != null ) {
var nums = line.strip().split(" ");
foreach(string num in nums) {
this.total += num.to_int();
}
}
}
stdout.printf("Grand Total: %llu\n", total);
}
static int main(string[] args) {
var app = new SumApplication();
app.run();
return 0;
}
}