Abhay Mujumdar [Mon, 20 Aug 2012 15:31:37 +0000 (08:31 -0700)]
Merge pull request #17 from robinluckey/master
Catching up with community contributions
Robin Luckey [Mon, 20 Aug 2012 15:20:22 +0000 (08:20 -0700)]
Merge pull request #15 from arcriley/master
Add support for Genie and Vala .vapi files
Robin Luckey [Mon, 20 Aug 2012 15:19:34 +0000 (08:19 -0700)]
Merge pull request #14 from sylvestre/master
The txx extension can be used for C++ templates
Robin Luckey [Mon, 20 Aug 2012 15:18:52 +0000 (08:18 -0700)]
Merge pull request #13 from raphink/dev/mystrnlen_segfault
Prevent segfault on empty files
Arc Riley [Mon, 13 Aug 2012 17:27:42 +0000 (13:27 -0400)]
Added support for Genie and Vala .vapi files
Sylvestre Ledru [Tue, 19 Jun 2012 14:45:40 +0000 (16:45 +0200)]
The txx extension can be used for C++ templates
Raphaël Pinson [Sat, 2 Jun 2012 05:59:45 +0000 (07:59 +0200)]
Return NULL when NULL is passed to disambiguate_pp
Raphaël Pinson [Fri, 1 Jun 2012 22:02:42 +0000 (00:02 +0200)]
Prevent segfault on empty files
Robin Luckey [Fri, 1 Jun 2012 16:00:02 +0000 (09:00 -0700)]
Merge pull request #12 from raphink/puppet
Improve Puppet/Pascal disambiguation
Raphaël Pinson [Sat, 26 May 2012 18:26:45 +0000 (20:26 +0200)]
Puppet/Pascal: strncmp calls are really not helping, functionally or timewise
Raphaël Pinson [Sat, 26 May 2012 17:51:23 +0000 (19:51 +0200)]
Puppet/Pascal: try harder to find Puppet keywords
Raphaël Pinson [Sat, 26 May 2012 11:07:36 +0000 (13:07 +0200)]
Puppet parser: improve regex matching for caret to work
Raphaël Pinson [Fri, 25 May 2012 16:30:14 +0000 (18:30 +0200)]
Recognize Puppet node definitions
Raphaël Pinson [Fri, 25 May 2012 16:13:23 +0000 (18:13 +0200)]
Avoid detecting Pascal code as Puppet
Raphaël Pinson [Fri, 25 May 2012 14:32:56 +0000 (16:32 +0200)]
Puppet parser: detect classes and defines with colons
Robin Luckey [Wed, 18 Apr 2012 22:14:39 +0000 (15:14 -0700)]
Merge branch 'master' of github.com:robinluckey/ohcount
Robin Luckey [Wed, 18 Apr 2012 22:16:59 +0000 (15:16 -0700)]
Merge pull request #10 from raphink/master
Add support for the Augeas language (based on OCaml)
Robin Luckey [Wed, 18 Apr 2012 22:16:45 +0000 (15:16 -0700)]
Merge pull request #11 from raphink/tex
Add sty and cls for LANG_TEX, add dtx and LANG_TEX_DTX
Abhay Mujumdar [Fri, 13 Apr 2012 19:12:00 +0000 (12:12 -0700)]
Merge pull request #12 from blackducksw/libmagic
Libmagic
Raphaël Pinson [Thu, 12 Apr 2012 08:33:17 +0000 (10:33 +0200)]
Add .cls to LANG_TEX
Raphaël Pinson [Wed, 11 Apr 2012 22:02:56 +0000 (00:02 +0200)]
Make DTX a separate language derived from TeX.
Raphaël Pinson [Wed, 11 Apr 2012 21:46:16 +0000 (23:46 +0200)]
Add dtx and sty for LANG_TEX
Raphaël Pinson [Wed, 11 Apr 2012 15:50:39 +0000 (17:50 +0200)]
Add support for the Augeas language (based on OCaml)
Robin Luckey [Mon, 9 Apr 2012 20:50:12 +0000 (13:50 -0700)]
Fixes recursion bug in disambiguate_in().
The basic strategy of disambiguate_in() is to strip the trailing *.in
extension from the filepath, and then to disambiguate the file as if it
originally had that name. Thus, given file "foo.in", disambiguate_in()
will disambiguate "foo".
disambiguate_in() achieves this while re-using the exact same file on
disk. This is possible because a SourceFile struct has both a `filepath`
(the name we use for disambiguation purposes) and the `diskpath` (the
actual name on disk).
So disambiguate_in() instantiates a new SourceFile with a stripped
filepath, yet the same diskpath and same file contents.
The bug is that the code did this incorrectly: when assigning the
diskpath of the new SourceFile, it would mistakenly assign it the
previous SourceFile's *filepath* instead of the previous SourceFile's
diskpath.
If disambiguate_in() runs just once (when the file has just a single
*.in extension, the usual case), this mistake does not matter because
the filepath and diskpath are the same.
But if disambiguate_in() recurses on itself (when the file has multiple
*.in.in extensions), then during the second pass the filepath and
diskpath will not be equal -- they will differ by one missing *.in
extension. Thus the diskpath of the new SourceFile will refer to a
(probably) non-existent file.
The bug is hard to explain but was simple to correct.
In addition to correcting the diskpath assignment, I've fixed a memory
leak: it was possible to allocate a new SourceFile, and then immediately
return NULL, which fails to free the SourceFile. I've moved the
allocation *after* the NULL return check to avoid this.
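The filepath/diskpath split described above can be sketched as follows. This is a minimal illustration, not the actual ohcount code: `SketchFile`, `copy_prefix` and `strip_in` are hypothetical names, and only the two path fields from the message are modeled.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for SourceFile; only the two paths are modeled. */
typedef struct {
    char *filepath; /* name used for disambiguation purposes */
    char *diskpath; /* actual name on disk */
} SketchFile;

/* Returns a freshly allocated copy of the first n bytes of s. */
static char *copy_prefix(const char *s, size_t n) {
    char *out = malloc(n + 1);
    if (out == NULL) return NULL;
    memcpy(out, s, n);
    out[n] = '\0';
    return out;
}

/* Returns a new SketchFile with one trailing ".in" stripped from the
 * filepath and the diskpath carried over, or NULL if nothing can be
 * stripped. */
SketchFile *strip_in(const SketchFile *src) {
    size_t len = strlen(src->filepath);

    /* Do the NULL-return check before allocating, so nothing leaks. */
    if (len <= 3 || strcmp(src->filepath + len - 3, ".in") != 0)
        return NULL;

    SketchFile *out = malloc(sizeof *out);
    if (out == NULL) return NULL;
    out->filepath = copy_prefix(src->filepath, len - 3);
    /* The fix: copy the previous diskpath, not the previous filepath. */
    out->diskpath = copy_prefix(src->diskpath, strlen(src->diskpath));
    if (out->filepath == NULL || out->diskpath == NULL) {
        free(out->filepath);
        free(out->diskpath);
        free(out);
        return NULL;
    }
    return out;
}
```

Calling strip_in() twice on "foo.in.in" yields filepath "foo" while the diskpath still names "foo.in.in", which is the case the original bug got wrong.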
Robin Luckey [Thu, 8 Mar 2012 06:50:47 +0000 (22:50 -0800)]
Removes unused escape_path() function
Robin Luckey [Thu, 8 Mar 2012 00:10:11 +0000 (16:10 -0800)]
Use libmagic instead of spawning a process to run `file`
Robin Luckey [Tue, 6 Mar 2012 22:05:06 +0000 (14:05 -0800)]
Change README to use Github flavored Markdown
Robin Luckey [Tue, 6 Mar 2012 22:02:18 +0000 (14:02 -0800)]
README updates and corrections
Robin Luckey [Tue, 6 Mar 2012 21:46:21 +0000 (13:46 -0800)]
Merge pull request #11 from dcsobral/forth
Initial support for Forth
Robin Luckey [Tue, 6 Mar 2012 21:39:15 +0000 (13:39 -0800)]
Merge pull request #9 from haraldkl/master
Fixing Bug in Fortran disambiguation
Daniel C. Sobral [Thu, 23 Feb 2012 20:05:15 +0000 (18:05 -0200)]
Initial support for Forth
This is based on the Scala parser, which is actually quite
incorrect -- assumes existence of single-quote strings (which
will cause problems on any file with symbols), doesn't know
multiline strings, doesn't handle nested comments: all of which
made it a pretty good starting point for Forth.
Parsing Forth is impossible, but this will recognize comments,
strings and blank lines on most projects. Tested against FreeBSD
source.
Abhay Mujumdar [Mon, 13 Feb 2012 19:13:36 +0000 (11:13 -0800)]
Merge pull request #10 from blackducksw/OTWO-1300
OTWO-1300 Improves *.pl disambiguation to ignore smileys :-)
Robin Luckey [Wed, 8 Feb 2012 15:24:28 +0000 (10:24 -0500)]
OTWO-1300 Improves *.pl disambiguation to ignore smileys :-)
Smiley faces in Perl strings and comments look similar to Prolog
rule syntax. This patch makes two improvements:
- Better detection of perl shebangs (#!%PERL% now recognized)
- A prolog ':-' token must be followed by a space or a newline
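The second rule can be sketched as a plain string scan. This is only an illustration under the assumption that the check is a literal ':-' followed by ' ' or '\n'; the function name is hypothetical and the real detector may use a different mechanism.

```c
#include <string.h>

/* Returns 1 if buf contains ":-" immediately followed by a space or a
 * newline, 0 otherwise. A smiley such as ":-)" does not count. */
int has_prolog_rule_token(const char *buf) {
    const char *p = buf;
    while ((p = strstr(p, ":-")) != NULL) {
        char next = p[2]; /* at worst the terminating NUL, so in bounds */
        if (next == ' ' || next == '\n')
            return 1;
        p += 2;
    }
    return 0;
}
```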
Harald Klimach [Tue, 3 Jan 2012 10:45:10 +0000 (11:45 +0100)]
Update Fortran extensions to cover the list supported by gfortran
(.FPP, .F, .FOR, .FTN, .F90, .F95, .F03 or .F08), see
http://gcc.gnu.org/onlinedocs/gfortran/Preprocessing-Options.html
Harald Klimach [Tue, 3 Jan 2012 09:27:44 +0000 (10:27 +0100)]
Changed the logic to disambiguate free and fixed formatted Fortran
Assume fixed format code, and report free format as soon as any line
breaks this assumption.
(It is easier to check for fixed form constraints)
Rules for fixed format are taken from the standard, see
ftp://ftp.nag.co.uk/sc22wg5/N1801-N1850/N1830.pdf p. 47.
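A rough sketch of this "assume fixed, bail out on the first violation" strategy follows. It is a simplification and not the committed logic: it only checks that columns 1-5 of a non-comment line hold blanks or digits (a statement label), and the function names are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if one line (without its newline) is consistent with fixed
 * form under the simplified rule described above. */
static int line_fits_fixed_form(const char *line, size_t len) {
    if (len > 0 && line[len - 1] == '\r') len--; /* tolerate CRLF */
    if (len == 0) return 1;
    /* 'C' or '*' in column 1 marks a comment line. */
    if (line[0] == 'C' || line[0] == 'c' || line[0] == '*') return 1;
    for (size_t i = 0; i < len && i < 5; i++) {
        if (line[i] == '!') return 1; /* rest of the line is a comment */
        if (line[i] != ' ' && !isdigit((unsigned char)line[i])) return 0;
    }
    return 1;
}

/* Returns 1 as soon as any line breaks the fixed-form assumption,
 * 0 if every line fits. */
int looks_free_form(const char *buf, size_t len) {
    size_t start = 0;
    while (start < len) {
        size_t end = start;
        while (end < len && buf[end] != '\n') end++;
        if (!line_fits_fixed_form(buf + start, end - start)) return 1;
        start = end + 1;
    }
    return 0;
}
```

Checking for a violation of the fixed-form layout needs only a column test per line, which is why this direction is the easier one.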
Harald Klimach [Mon, 2 Jan 2012 00:33:38 +0000 (01:33 +0100)]
More typical Fortran free formatted test file.
Harald Klimach [Sat, 31 Dec 2011 11:16:11 +0000 (12:16 +0100)]
Return free format in the Fortran disambiguation
if the code is definitely not fixed.
Robin Luckey [Thu, 22 Dec 2011 16:07:09 +0000 (08:07 -0800)]
Adds unit test for escape_path()
Robin Luckey [Wed, 21 Dec 2011 17:46:35 +0000 (09:46 -0800)]
OTWO-1137 Escapes single quotes in file paths
Robin Luckey [Fri, 16 Dec 2011 19:19:19 +0000 (11:19 -0800)]
Adds additional comment styles for MS-DOS batch files
In addition to 'REM', we now accept '@REM' and '::'.
Note that test/expected_dir/bat1.bat should be tab-delimited (not
space-delimited), so this patch also corrects that.
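For illustration, the three accepted comment prefixes could be recognized with a check like the one below. This is a hand-written sketch with hypothetical names, not the generated parser; treating REM case-insensitively and allowing leading blanks are assumptions.

```c
#include <ctype.h>

/* Returns 1 if p starts with the word REM (any case) followed by
 * whitespace or the end of the string. */
static int starts_with_rem(const char *p) {
    return toupper((unsigned char)p[0]) == 'R' &&
           toupper((unsigned char)p[1]) == 'E' &&
           toupper((unsigned char)p[2]) == 'M' &&
           (p[3] == '\0' || isspace((unsigned char)p[3]));
}

/* Returns 1 if the line is a batch comment: 'REM', '@REM' or '::'. */
int is_bat_comment(const char *line) {
    while (*line == ' ' || *line == '\t') line++; /* skip indentation */
    if (line[0] == ':' && line[1] == ':') return 1;
    if (line[0] == '@') line++;
    return starts_with_rem(line);
}
```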
Robin Luckey [Thu, 15 Dec 2011 23:11:20 +0000 (15:11 -0800)]
Merge branch 'master' of git://github.com/pfusik/ohcount
Robin Luckey [Thu, 15 Dec 2011 22:48:16 +0000 (14:48 -0800)]
Corrections to Logtalk unit tests
Robin Luckey [Thu, 15 Dec 2011 22:50:32 +0000 (14:50 -0800)]
Merge branch 'master' of git://github.com/pmoura/ohcount
Robin Luckey [Thu, 15 Dec 2011 22:30:59 +0000 (14:30 -0800)]
Fixes crash in disambiguate_r() when source file is empty
Thanks to ehsan for discovering this bug.
Ehsan Akhgari [Sun, 9 Oct 2011 21:06:59 +0000 (17:06 -0400)]
Enable building on Mac, which lacks the strnlen function, by using memchr instead
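A portable replacement along these lines might look like the sketch below. The name `sketch_strnlen` is hypothetical; the committed helper may differ.

```c
#include <string.h>

/* Returns the length of s, but never more than maxlen. Uses memchr so
 * it works on platforms without strnlen. */
size_t sketch_strnlen(const char *s, size_t maxlen) {
    const char *end = memchr(s, '\0', maxlen);
    return end != NULL ? (size_t)(end - s) : maxlen;
}
```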
Robin Luckey [Thu, 15 Dec 2011 21:51:55 +0000 (13:51 -0800)]
Merge pull request #6 from cmarcelo/qml
Add support for Qt's QML language
Robin Luckey [Thu, 15 Dec 2011 21:42:47 +0000 (13:42 -0800)]
Merge pull request #5 from koraktor/ruby
Added more filenames and extensions for Ruby
Robin Luckey [Thu, 15 Dec 2011 21:42:33 +0000 (13:42 -0800)]
Merge pull request #4 from koraktor/mustache
Treat Mustache templates as HTML
Caio Marcelo de Oliveira Filho [Sat, 22 Oct 2011 05:08:48 +0000 (02:08 -0300)]
Add support for Qt's QML language
Reusing the JS parser, since QML is 'almost' JavaScript. The
approximation is good enough for the line counting purposes.
Piotr Fusik [Mon, 29 Aug 2011 12:42:53 +0000 (14:42 +0200)]
Check if *.def files are Modula-2.
Sebastian Staudt [Thu, 11 Aug 2011 12:50:54 +0000 (14:50 +0200)]
Treat Mustache templates as HTML
Mustache introduces only a small amount of additional syntax, so treating
its templates as pure HTML shouldn't hurt.
Sebastian Staudt [Thu, 11 Aug 2011 12:47:42 +0000 (14:47 +0200)]
Added more filenames and extensions for Ruby
Paulo Moura [Tue, 9 Aug 2011 16:11:04 +0000 (11:11 -0500)]
Minor improvement for detecting Perl files.
Robin Luckey [Tue, 9 Aug 2011 15:39:34 +0000 (08:39 -0700)]
Merge branch 'ecere'
Robin Luckey [Tue, 9 Aug 2011 15:38:16 +0000 (08:38 -0700)]
Completes eC parser
- Adds parse_ec() to the list of parsers
- Adds a test to ensure that line counter works
Piotr Fusik [Thu, 15 Jul 2010 18:26:21 +0000 (20:26 +0200)]
Add file extensions "asx" and "as8" - 6502 assembler.
Robin Luckey [Mon, 8 Aug 2011 22:03:22 +0000 (15:03 -0700)]
OTWO-922 Adds CoffeeScript parser
Robin Luckey [Mon, 8 Aug 2011 20:19:06 +0000 (13:19 -0700)]
Merge https://github.com/bytbox/ohcount into jam
Robin Luckey [Mon, 8 Aug 2011 20:12:02 +0000 (13:12 -0700)]
Merge branch 'adding_racket' of https://github.com/jbclements/ohcount into racket
Conflicts:
src/hash/languages.gperf
src/hash/parsers.gperf
src/languages.h
test/unit/parser_test.h
Robin Luckey [Mon, 8 Aug 2011 20:06:24 +0000 (13:06 -0700)]
Merge branch 'master' of https://github.com/earl/ohcount
Robin Luckey [Mon, 8 Aug 2011 20:02:06 +0000 (13:02 -0700)]
Merge branch 'rebol' of https://github.com/earl/ohcount into rebol
Robin Luckey [Mon, 8 Aug 2011 19:45:59 +0000 (12:45 -0700)]
Merge https://github.com/ecere/ohcount into ecere
Paulo Moura [Sun, 7 Aug 2011 01:32:11 +0000 (02:32 +0100)]
Added basic unit tests for Prolog parsing.
Paulo Moura [Sun, 7 Aug 2011 00:29:14 +0000 (01:29 +0100)]
Added basic unit tests for Logtalk parsing.
Paulo Moura [Sat, 6 Aug 2011 23:39:39 +0000 (00:39 +0100)]
Added basic support for Logtalk and Prolog (missing parsers in previous commit!).
Paulo Moura [Sat, 6 Aug 2011 23:10:31 +0000 (00:10 +0100)]
Added basic support for Logtalk and Prolog.
John Clements [Wed, 6 Jul 2011 19:01:20 +0000 (12:01 -0700)]
adding racket, re-using lisp parser, following clojure's lead
Robin Luckey [Mon, 20 Jun 2011 15:28:44 +0000 (11:28 -0400)]
OTWO-803 Fixes disambiguate_pp() performance sink
disambiguate_pp() failed to execute in a reasonable time for extremely
large (1MB+) files.
The reason is that a regular expression is evaluated for each line of
the file, and this regular expression is scoped from the beginning of
the line to the end of the file. When the file is extremely large,
the regular expression evaluation runs away with the CPU.
By limiting the scope of the regular expression evaluation to no more
than 100 characters from its start point, we can avoid the runaway
performance sink. This is a reasonable change since the expression we
are looking to match should almost always fit within 100 chars anyway.
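The windowing idea can be sketched without a regex library. In the sketch below a plain substring search stands in for the regular expression, and `MAX_SCOPE`, `match_within` and `count_line_matches` are hypothetical names; only the 100-character bound comes from the message above.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SCOPE 100 /* bound taken from the commit message */

/* Stand-in for the regex: does needle occur within the first n bytes
 * starting at p? */
static int match_within(const char *p, size_t n, const char *needle) {
    size_t m = strlen(needle);
    for (size_t i = 0; m <= n && i + m <= n; i++)
        if (memcmp(p + i, needle, m) == 0) return 1;
    return 0;
}

/* Evaluates the match once per line start, but never looks further than
 * MAX_SCOPE bytes ahead, so the total work stays linear in len. */
size_t count_line_matches(const char *buf, size_t len, const char *needle) {
    size_t count = 0, start = 0;
    while (start < len) {
        size_t remaining = len - start;
        size_t scope = remaining < MAX_SCOPE ? remaining : MAX_SCOPE;
        if (match_within(buf + start, scope, needle)) count++;
        const char *nl = memchr(buf + start, '\n', remaining);
        if (nl == NULL) break;
        start = (size_t)(nl - buf) + 1;
    }
    return count;
}
```

Without the bound, each of the N line starts scans up to the end of the file, which is quadratic for large inputs.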
Andreas Bolka [Wed, 1 Jun 2011 23:40:11 +0000 (01:40 +0200)]
Fix filename in Go parser attribution line
Signed-off-by: Andreas Bolka <a@bolka.at>
Andreas Bolka [Wed, 1 Jun 2011 23:39:25 +0000 (01:39 +0200)]
Fix ragel include in parser example skeleton
Signed-off-by: Andreas Bolka <a@bolka.at>
Andreas Bolka [Wed, 1 Jun 2011 23:35:01 +0000 (01:35 +0200)]
Implement parsing of REBOL multi-line strings
Signed-off-by: Andreas Bolka <a@bolka.at>
Andreas Bolka [Wed, 1 Jun 2011 21:21:23 +0000 (23:21 +0200)]
Add REBOL detection and (basic) parsing
Also adds a simple .r disambiguation to discern REBOL and R sources. R
is the default, REBOL is used if "rebol" is found anywhere in the
contents.
The REBOL parser currently does not handle multi-line strings ({...}),
which could (in rare cases) lead to string parts being classified as
comments.
Signed-off-by: Andreas Bolka <a@bolka.at>
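The .r rule described in this commit amounts to a substring search over the file contents. The sketch below is an illustration with a hypothetical function name; matching "rebol" case-insensitively is an assumption, not something the message states.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if "rebol" (any case, by assumption) occurs anywhere in the
 * contents, meaning the .r file would be treated as REBOL; 0 means the
 * default, R. */
int contains_rebol(const char *contents, size_t len) {
    static const char key[] = "rebol";
    const size_t klen = sizeof key - 1;
    for (size_t i = 0; i + klen <= len; i++) {
        size_t j = 0;
        while (j < klen &&
               tolower((unsigned char)contents[i + j]) == key[j])
            j++;
        if (j == klen) return 1;
    }
    return 0;
}
```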
Jerome St-Louis [Sat, 21 May 2011 08:15:39 +0000 (04:15 -0400)]
Added missing detector test files
Jerome St-Louis [Sat, 21 May 2011 08:00:54 +0000 (04:00 -0400)]
Added support for the eC language (www.ecere.com)
Scott Lawrence [Mon, 18 Apr 2011 15:30:53 +0000 (11:30 -0400)]
add test cases for jam
Scott Lawrence [Mon, 18 Apr 2011 15:28:26 +0000 (11:28 -0400)]
Adding recognition and parser for Perforce Jam (Jamfile/Jamrules), based on the parser for shell
Robin Luckey [Fri, 8 Apr 2011 15:51:18 +0000 (08:51 -0700)]
Merge branch 'master' of https://github.com/bytbox/ohcount
Robin Luckey [Fri, 8 Apr 2011 15:41:46 +0000 (08:41 -0700)]
Merge branch 'master' of https://github.com/chris-morgan/ohcount
Robert Schultz [Mon, 4 Apr 2011 16:32:44 +0000 (12:32 -0400)]
OTWO-571 Fixed a bug where ohcount would follow symbolically linked directories, causing problems/security concerns
Robin Luckey [Mon, 7 Feb 2011 21:55:54 +0000 (13:55 -0800)]
'build clean' should remove swig-generated file
Robin Luckey [Fri, 21 Jan 2011 15:46:45 +0000 (07:46 -0800)]
Fixes uninitialized data in tmp filename
The filename string used for the detector's temporary file had an
uninitialized byte at its end. Usually this byte is 0, so it has no ill
effect. Occasionally it can be a garbage byte, which can cause the
temporary file write() to fail.
Because Ohcount had been failing to check write()'s return value, these
errors went unnoticed, and incorrect line counts were silently returned.
I have fixed the uninitialized byte, and the previous commit adds
the appropriate error checks.
All code counted prior to this fix should be recounted.
Robin Luckey [Fri, 21 Jan 2011 15:44:28 +0000 (07:44 -0800)]
Fixes compiler warnings.
We were failing to check the return result on several system calls.
I've added the appropriate checks, with simple aborts in the case of
failure.
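As an illustration of checking a system call's result, a write() wrapper might look like the sketch below. It assumes a POSIX system; `write_all` is a hypothetical name, and it returns an error code where the commit itself simply aborts.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Writes all len bytes to fd. Returns 0 on success, -1 on failure.
 * A caller following the commit's approach would abort() on -1. */
int write_all(int fd, const void *data, size_t len) {
    const char *p = data;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue; /* interrupted, retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```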
Robert Schultz [Wed, 27 Oct 2010 18:50:15 +0000 (14:50 -0400)]
Added support for detecting certain bash scripts as shell scripts
Chris Morgan [Thu, 9 Sep 2010 03:33:11 +0000 (13:33 +1000)]
Added a parser for NSIS files (.nsi, .nsh).
Scott Lawrence [Fri, 23 Jul 2010 20:27:07 +0000 (16:27 -0400)]
updated author information
Scott Lawrence [Fri, 23 Jul 2010 20:24:08 +0000 (16:24 -0400)]
Merge branch 'golang'
Scott Lawrence [Fri, 23 Jul 2010 20:23:55 +0000 (16:23 -0400)]
made README formatting consistent
Scott Lawrence [Fri, 23 Jul 2010 20:20:08 +0000 (16:20 -0400)]
added golang unit tests
Scott Lawrence [Fri, 23 Jul 2010 20:09:18 +0000 (16:09 -0400)]
added golang detection and parsing, based on C
Scott Lawrence [Fri, 23 Jul 2010 18:38:57 +0000 (14:38 -0400)]
added checks for gperf, erroring out if not found
Scott Lawrence [Fri, 23 Jul 2010 18:37:01 +0000 (14:37 -0400)]
added check for ragel
Ken Barber [Tue, 4 May 2010 02:12:26 +0000 (03:12 +0100)]
Added support for the Puppet DSL from Puppetlabs.
Jiri Matela [Thu, 15 Apr 2010 16:35:28 +0000 (09:35 -0700)]
Recognize .cu files as CUDA code; the C parser is used for CUDA files
Jason Turner [Wed, 31 Mar 2010 18:10:29 +0000 (12:10 -0600)]
Add support for ChaiScript - adding missing files from last commit
Jason Turner [Wed, 31 Mar 2010 18:01:38 +0000 (12:01 -0600)]
Add support for parsing/counting of ChaiScript. ChaiScript support is based closely on the support for JavaScript, as the languages are similar in structure. All ChaiScript unit tests pass, but 1 or 2 (appears to be random) of the ruby diff tests fail.
Robin Luckey [Tue, 30 Mar 2010 18:30:26 +0000 (11:30 -0700)]
FIX - missing newline at end of languages.h
Robin Luckey [Tue, 30 Mar 2010 17:53:10 +0000 (10:53 -0700)]
Merge branch 'wirth'
Robin Luckey [Tue, 30 Mar 2010 17:49:38 +0000 (10:49 -0700)]
Implements Modula-2, Modula-3, Oberon
We recognize extensions m3, i3, mod, ob2, obn, def.
Ticket #30 requests that we also recognize extensions `m` and `d`.
That is not implemented here, because that requires disambiguation
from other languages, so ticket #30 remains open.
Robin Luckey [Mon, 29 Mar 2010 21:27:39 +0000 (14:27 -0700)]
FIX: Buffer overrun in emacs mode header logic
If the emacs mode header is not well-formed (for example, if it is
missing a terminating "-*-"), then we run off the end of our buffer.
The emacs mode header parsing is a little complicated, and the
best answer is probably to clean up the parser. As an easier, quicker,
fix, I simply added a maximum string length check.
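A bounded scan for the closing "-*-" could be sketched as below. This is not the detector's code: `MAX_MODE_HEADER`, `find_marker` and `emacs_mode_header` are hypothetical, and the limit of 256 is an arbitrary value chosen for the example.

```c
#include <stddef.h>
#include <string.h>

#define MAX_MODE_HEADER 256 /* hypothetical limit for this sketch */

/* Finds "-*-" within the first n bytes of p, or returns NULL. */
static const char *find_marker(const char *p, size_t n) {
    for (size_t i = 0; i + 3 <= n; i++)
        if (memcmp(p + i, "-*-", 3) == 0) return p + i;
    return NULL;
}

/* Copies the text between the opening and closing "-*-" into out.
 * Returns 1 on success, 0 if the header is missing, unterminated,
 * longer than MAX_MODE_HEADER, or does not fit in out. */
int emacs_mode_header(const char *buf, size_t len,
                      char *out, size_t out_size) {
    const char *opening = find_marker(buf, len);
    if (opening == NULL) return 0;

    const char *body = opening + 3;
    size_t remaining = len - (size_t)(body - buf);
    if (remaining > MAX_MODE_HEADER) remaining = MAX_MODE_HEADER;

    /* The search never reads past the buffer or past the limit. */
    const char *closing = find_marker(body, remaining);
    if (closing == NULL) return 0;

    size_t n = (size_t)(closing - body);
    if (n + 1 > out_size) return 0;
    memcpy(out, body, n);
    out[n] = '\0';
    return 1;
}
```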
Robin Luckey [Fri, 26 Mar 2010 21:52:03 +0000 (14:52 -0700)]
[FIX] Emacs mode "C" improperly detected as C++
detector.c uses the file extension map to convert emacs modes
to languages. Because the upper-case "C" file extension maps to
C++, this means that emacs mode "C" also mapped to C++, which was
incorrect.
I've added a special check in the code for this case. Emacs modes
"c" and "C" now correctly map to straight C code.
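The special case can be sketched as a check placed in front of the extension lookup. Both function names and the tiny lookup table are hypothetical stand-ins for the real extension map in detector.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the case-sensitive extension map, in which
 * the upper-case "C" extension means C++. */
static const char *extension_to_language(const char *ext) {
    if (strcmp(ext, "C") == 0) return "cpp";
    if (strcmp(ext, "c") == 0) return "c";
    if (strcmp(ext, "rb") == 0) return "ruby";
    return NULL;
}

/* Maps an emacs mode name to a language. Modes "c" and "C" both mean
 * plain C, so they are handled before consulting the extension map. */
const char *emacs_mode_to_language(const char *mode) {
    if (strcmp(mode, "c") == 0 || strcmp(mode, "C") == 0)
        return "c";
    return extension_to_language(mode);
}
```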