I am a high school sophomore and visited a professor in the molecular biology sciences who is also an expert programmer. (He builds robots, biochips, and a bunch of other cool gadgets.) While we were discussing other things, he urged me to learn Python programming, and now I'm very interested in computer science and progressing rapidly. I know Python is considered a powerful language because of its simple syntax. But if I want to go into bioinformatics (especially to use the tool http://www.ncbi.nlm.nih.gov/BLAST/) and computer programming in general, maybe even study cybersecurity, is it an ideal language to start with? Or is C/C++ a better choice?
Name: Anonymous 2011-07-27 15:45
- Everything you write will be open source. No FASLs, DLLs or EXEs. There may be some very important instances where a business wouldn't want anybody to see the internal implementation of their modules and where strict control over levels of access is necessary. Python third-party library licensing is overly complex. Licenses like MIT allow you to create derived works as long as you maintain attribution; GNU GPL and other 'viral' licenses don't allow derived works without inheriting the same license. To inherit the benefits of an open source culture you also inherit the complexities of the licensing hell.
- Installation mentality: Python has inherited the idea that libraries should be installed, so it is in fact designed to work inside Unix package management, which brings a fair amount of baggage (library version issues) and reduced portability. Of course it must be possible to package libraries with your application, but it's not conventional, and it can be hard to deploy as a desktop app due to cross-platform issues, language version, etc. Open source projects generally don't care about Windows; most open source developers use Linux because "Windows sucks".
- Probably the biggest practical problem with Python is that there's no well-defined API that doesn't change. This makes life easier for Guido and tough on everybody else. That's the real cause of Python's "version hell".
- The Global Interpreter Lock (GIL) is a significant barrier to concurrency. Due to signaling with a CPU-bound thread, it can cause a slowdown even on a single processor. The reason for employing the GIL in Python is to ease the integration of C/C++ libraries. Additionally, the CPython interpreter code is not thread-safe, so the only way other threads can do useful work is if they are in some C/C++ routine, which must be thread-safe.
- Python (like most other scripting languages) does not require variables to be declared, as (let (x 123) ...) in Lisp or int x = 123 in C/C++. This means that Python can't even detect a trivial typo - it will produce a program that keeps running for hours until it reaches the typo - THEN goes boom and you lose all unsaved data. Local and global scopes are unintuitive. Having variables leak after a for-loop can definitely be confusing. Worse, binding of loop indices can be very confusing; e.g. "for a in list: result.append(lambda: fcn(a))" probably won't do what you think it would. Why the nonlocal/global/auto-local scope nonsense?
- Python indulges messy horizontal code (> 80 chars per line), where in Lisp one would use "let" to break computation into manageable pieces. Get used to things like self.convertId([(name, uidutil.getId(obj)) for name, obj in container.items() if IContainer.isInstance(obj)])
- Crippled support for functional programming. Python's lambda is limited to a single expression and doesn't allow statements, a side effect of Python making a distinction between expressions and statements. Assignments are not expressions. The higher-order function reduce was removed from the builtins in Python 3.0 and has to be imported from functools. No continuations or even tail call optimization: "I don't like reading code that was written by someone trying to use tail recursion." --Guido
- Python's syntax, based on the SETL language and mathematical Set Theory, is non-uniform, hard to understand and parse, compared to simpler languages, like Lisp, Smalltalk, Nial and Factor. Instead of the usual "fold" and "map" functions, Python uses "set comprehension" syntax, which has an overwhelmingly large collection of underlying linguistic and notational conventions, each with its own variable binding semantics. To complicate things even more, Python uses the so-called "off-side" indentation rule (aka Forced Indentation of Code), also taken from the math-intensive Haskell language. This, in effect, makes Python look like an overengineered toy for math geeks.
- Quite quirky: triple-quoted strings seem like a syntax-decision from a David Lynch movie, and double-underscores, like __init__, seem appropriate in C, but not in a language that provides list comprehensions. There has to be a better way to mark certain features as internal or special than just calling it __feature__.
- Python is unintuitive and has too many confusing non-orthogonal features: references can't be used as hash keys; expressions in default arguments are calculated when the function is defined, not when it’s called. Why have both dictionaries and objects? Why have both types and duck-typing? Why is there ":" in the syntax if it almost always has a newline after it?
- Python's garbage collection uses naive reference counting, which is slow and doesn't handle circular references, meaning you have to expect subtle memory leaks and can't easily use arbitrary graphs as your data. In effect Python complicates even simple tasks, like keeping a directory tree with symlinks.
- Problems with arithmetic: no Numerical Tower (nor even rational/complex numbers), meaning 1/2 would produce 0, instead of 0.5, leading to subtle and dangerous errors.
- Poor UTF support; Unicode string handling is somewhat awkward.
- self everywhere can make you feel like OO was bolted on, even though it wasn't.
- No outstanding feature that makes the language, like the brevity of APL or the macros of Lisp.
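For anyone who wants to see two of the points above for themselves (the loop-index binding in lambdas and default arguments being evaluated at definition time), here is a minimal sketch. All function names are made up for illustration, and the "expected" values in the comments are what CPython gives for this exact snippet.

```python
# Sketch of two behaviours mentioned in the list above.
# Every name here (make_callbacks_*, append_item_*) is hypothetical.

def make_callbacks_late(values):
    """Each lambda looks up `a` when it is called, so all see the last value."""
    result = []
    for a in values:
        result.append(lambda: a)
    return result

def make_callbacks_bound(values):
    """Passing `a` as a default argument captures the current value."""
    result = []
    for a in values:
        result.append(lambda a=a: a)
    return result

def append_item_shared(item, bucket=[]):
    """The default list is built once, at definition time, and then reused."""
    bucket.append(item)
    return bucket

def append_item_fresh(item, bucket=None):
    """Common workaround: None as a sentinel, new list on every call."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

late = [f() for f in make_callbacks_late([1, 2, 3])]      # [3, 3, 3]
bound = [f() for f in make_callbacks_bound([1, 2, 3])]    # [1, 2, 3]

# list(...) takes a snapshot, because the shared default keeps growing.
shared = [list(append_item_shared(i)) for i in (1, 2)]    # [[1], [1, 2]]
fresh = [append_item_fresh(i) for i in (1, 2)]            # [[1], [2]]
```

The `a=a` trick and the `None` sentinel are the usual workarounds; whether needing them counts as a language flaw is the argument being made above.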
>>1
You can do anything in any Turing-complete language, but Python, C and C++ are probably not the best choices you can make. You should forget that C and C++ exist, actually.
>>9
There are actually enough bioinformatics-related libraries for CL. I would just tell OP to use whatever he finds comfortable. I find Lisp comfortable and thus I use it for my needs; if you find something else comfortable, use that. Of course, you do need to know your fair share of languages before making a choice. Otherwise you might pick some inappropriate language and keep on using it just because it's the only thing you know, wasting a lot of your time (some languages are good for rapid development cycles, others are better for performance).
I see this degenerated into Lisp-1 vs Lisp-n trolling.
Name: Anonymous 2011-07-28 21:35
>>26
Those are all Scheme implementations, not CL. Let's face it, aside from the visual appearance of the code, Scheme and CL are entirely different creatures. Scheme is like C, simple and pure, and CL is like C++, badly designed and bloated.
Most large-scale bio-informatics work is done in C++ as the primary work-horse language, plus maybe a higher-level language like Java for managing your distributed computing infrastructure (i.e. sending work-units out to various nodes on your supercomputer cluster, and collating them once they've been processed, using something like the Apache Hadoop framework). Also, GPGPU programming is becoming more and more popular, as newer supercomputers are usually built with GPUs, so expect things like CUDA, OpenCL, DirectCompute and perhaps C++ AMP in the near future.
Things like Lisp, Scheme, Haskell, etc. aren't used much outside of your typical undergraduate course.
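To make the "send out work-units, then collate" pattern above concrete for OP, here is a rough local sketch in Python. It does not use Hadoop or any cluster API: a thread pool stands in for the nodes, the G/C base count is just a toy workload, and every name in it is made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_work_units(sequence, size):
    """Cut the sequence into fixed-size chunks (the last one may be shorter)."""
    return [sequence[i:i + size] for i in range(0, len(sequence), size)]

def gc_count(chunk):
    """Toy per-unit job: count the G and C bases in one chunk."""
    return sum(1 for base in chunk if base in "GC")

def scatter_gather(sequence, size=4, workers=2):
    """Send each work unit to a worker, then collate the partial results."""
    units = split_into_work_units(sequence, size)
    # Assumption: local threads stand in for cluster nodes here.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(gc_count, units))
    return sum(partials)

total = scatter_gather("ATGCGCGATTACAGGC")  # 9 G/C bases in this toy sequence
```

On a real cluster the split, the per-unit job and the collation step would each be handled by the framework, but the shape of the computation is the same.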