Nov 7

The Fallacy of Industrial Expectation

Author: s1n
Category: Grinds My Gears, Project Bootstrap

I am what you would call a “professional student”: I have a Bachelor of Science in Computer Science, I am finishing a Master of Science in Computer Science with a focus in Intelligent Systems, I am contemplating a PhD, and I have been a software engineer for 6 years now.

Recently, Joel Spolsky published yet another article about how he feels the universities of the world are churning out students incapable of doing the daily duties of software development. I’ve read other scathing articles about academia. I’ve even responded to many comments similar to “you have a degree but don’t know how to use ToolX or program in LanguageY.” These criticisms always irritate me (and strike me as originating from someone who begrudges those with degrees), so I want to set the record straight about academia. There are 2 simple points I want to get across:

  1. The university’s primary concern is to teach you core knowledge and how to obtain new knowledge in any field.
  2. Computer Science is a division of Applied Mathematics.

It’s that simple. At no point is it the university’s responsibility to teach students an arbitrary tool or language simply because the industry is consistent in its opinion of it. I read this really great response to his article that echoes many of my complaints about this misconception.

Joel often comments that universities are trying to teach a particular language because it’s what the industry does or because MIT does it. That is wholly incorrect. The only reason a university favors a particular language is so that the professors can focus on teaching and grading in just one language, as that greatly simplifies their job. Java or Python gets chosen because you can express ideas in it simply and straightforwardly. The point isn’t to teach a language but to teach an idea expressed in a language.

If my undergraduate university had taught me specifically how to use CVS, that skill would essentially be wasted. Instead, they teach how version control systems work so that I may either implement one or just use one in my day-to-day job. Which sounds like a better idea in the long run?

Now keep in mind that CS is really just a division of Applied Mathematics. If you haven’t come to understand that, then you do not truly understand the field. In fact, the original “computers” were humans who computed mathematical equations.

Sure, most undergraduate assignments seem simple in comparison, but it’s because they don’t want to teach the peripheral tasks. Those tasks, such as testing, working in teams, and the latest “agile” techniques, are unrelated to the core understanding and vary widely within the industry. By understanding the core concepts, everything else is an extension of your existing knowledge.

If my university had taught me how to use FogBugz or how to write Perl TAP tests, I would have been looking for another university. My graduate school has yet to require that I use a particular language or tool, and has yet to teach a specific language or tool. In the long run, that makes me a more adaptable developer and far more valuable to my employer.


No comments

Oct 31

Pump King

Author: s1n
Category: KISS Saves Santa, Permission For Flyby

I am now a PumpKing. No, not that kind of pumpking, this kind:


[Images: pumpking, pumpking2]

I stole the Perl Foundation logo and created a stencil from it using GIMP. It took me approximately 3 hours to create that pumpkin.

Happy Halloween!


3 comments

Oct 25

Thesis In Frustration

Author: s1n
Category: 01100011, Grinds My Gears, Project Bootstrap

So this semester I have been investigating and working on my thesis. Right now, my focus is in Statistical Natural Language Processing. I don’t want to discuss the specifics of the research just yet, but it has the potential to completely upend the entire search industry.

I have been investigating how to build a large corpus from the web. My advisor favors using Google directly since it already exists and provides its search for free.

The first thing I did was investigate the Google SOAP API, only to find out that they deprecated it when they introduced the AJAX API. The new API only allows for about 60 results with no paging. Then I looked into the REST::Google API, but that only returns 10 results. Neither of those options seems feasible. I checked Yahoo’s Yahoo::Search interface and it only seemed to return 10 results (paging, if possible, was not obvious). I could write a direct scraper, but that would take a good deal of effort and I am not sure it would be worth it.

Then I even started looking at writing my own spider using WWW::Robot. This is a fairly complex module that does a ton of grunt work for you. The downside is that it behaves and follows the robots.txt protocol; that’s a problem for someone who wants to scrape everything with no regard for such a protocol.

After spending maybe 20-30 hours flipping over this in the last 6 weeks, I finally made the effort to meet with my advisor. Since he is no longer answering email or his phone, I met him after his late class and talked it over with him while he ate dinner in the campus restaurant. We talked and waffled back and forth about our approach. In the end, we decided to investigate Lucene’s capabilities.

Frustrated and lost, I went about my week until I talked with a PhD student who is also being advised by my advisor. Her patience with our advisor has been steadily declining. She missed a publication deadline because he failed to review a paper of hers. She also divulged that she intends to switch advisors because she is not making progress. I have been contemplating this myself, so it was good to hear that I am not the only one at my wits’ end.

I am not making progress and I am not willing to sacrifice my graduation. If I change advisors, hopefully I will find one who provides much more support and direction yet gives me the option to continue developing in Perl. One of the professors I want to speak with runs a programming language lab.

Maybe I can merge my interest in Perl 6 with my thesis!


2 comments

Oct 7

It Must Be Magic

Author: s1n
Category: Other

Apparently Ovid never heard of Clarke’s Laws:

Any sufficiently advanced technology is indistinguishable from magic.

I’m sure people thought the same thing about black powder, mechanical engines, computers, TVs, and, well, basically everything else that has ever graced his lifetime. Remember, Ovid, there was a point in time when structured programming reigned supreme.


No comments

Oct 4

Falling Behind

Author: s1n
Category: 01100011, Project Bootstrap

I dropped off the Perl Ironman blogging challenge again. This time it wasn’t due to a date miscalculation; I took a midterm exam on Thursday. My current class has taken up most of my time in the past 6 weeks. There haven’t been any chances for coding just yet. I did want to talk about something I have been looking into lately for my thesis.

First, I want to post a lazyweb question: has anyone worked with REST::Google? My first question is if anyone knows how to advance the page cursor with this module. I tried reading the code itself, but it uses Class::Accessor and Class::Data. I’m a bit unfamiliar with those modules (are they popular anymore?), and it looks like the cursor is read-only. I don’t really see the use for the module if it returns 10 results and cannot paginate.

So from that question, I took the sample and tried playing with it. This is just experimental to exercise the module’s capabilities. This is fairly boring, so I want to see if I can get some code working that can paginate through Google results using their REST API (if that’s even possible).

The author of this module probably feels very clever; they hid subpackages from CPAN by putting the package name on a separate line from the package keyword. They also used __PACKAGE__->mk_ro_accessors, which looks like it will generate attribute accessors at runtime. I’m guessing CPAN cannot index that either. What’s the point of uploading your code to a public repository if you take measures to hide it from the repository?
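To make those two tricks concrete, here is a minimal sketch rather than the module’s actual code. The package name Hypothetical::Result and the helper make_ro_accessors are made up for illustration; the helper is a hand-rolled stand-in and is not how Class::Accessor implements mk_ro_accessors.

```perl
use strict;
use warnings;

# Splitting "package" and its name across two lines is a known way to keep
# a namespace out of the PAUSE/CPAN index; Perl itself does not care.
package
    Hypothetical::Result;    # hypothetical name, not taken from REST::Google

sub new {
    my ($class, %fields) = @_;
    return bless { %fields }, $class;
}

# Hypothetical stand-in for a read-only accessor generator: install one
# closure per field name into the package's symbol table at runtime.
sub make_ro_accessors {
    my ($class, @names) = @_;
    for my $name (@names) {
        no strict 'refs';
        *{"${class}::${name}"} = sub {
            my $self = shift;
            die "'$name' is read-only\n" if @_;
            return $self->{$name};
        };
    }
    return;
}

# The accessors only exist once this line has run.
__PACKAGE__->make_ro_accessors(qw(title url));

package main;

my $result = Hypothetical::Result->new(
    title => 'Example',
    url   => 'http://example.com/',
);
print $result->title, "\n";
```

Because the methods are created while the program runs, a tool that only reads the source for sub declarations would not see title or url, which is presumably why reading the code to find the cursor accessor was awkward.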

Anyways, I’m soliciting ideas for paginating Google’s REST API results. Note: the SOAP API has been terminated, so that avenue is closed.


2 comments

Sep 9

Ampersand

Author: s1n
Category: 01100011, Meat Space

Today, only fRew and I ended up meeting for the September Dallas.p6m meeting. Turns out several people forgot that today was in fact the Tuesday after Labor Day and not Monday. We mostly just sat and talked about miscellaneous software things. We discussed GPS, my Algorithms course, Parrot, PGE, and styling. Styling was the last thing we discussed and the one that got semi-heated.

I discussed my reasoning for my styling quirks, which fRew insisted he would replace in a heartbeat. I’ve mostly been honing my preferences by tracking down bugs and noting which styles seem to lead to common mistakes. For example, I always place the if statement’s opening parenthesis immediately after the ‘f’ because you cannot have one without the other (well, you can have the boolean expression alone, but it rarely makes sense unless it’s a return value). I tried to apply the same reasoning to why I put my braces at the end of the if line with a space between the ‘)’ and the ‘{’: the block started by ‘{’ can exist without an if clause. I always keep the ‘{’ on the same line so that commenting out the if clause alone fails to compile; I’ve found and fixed too many bugs that resulted from true laziness.

Now that may seem kinda weird, but it’s 1) a mental separation technique and 2) an attempt to reduce the number of standalone nested blocks (that can do odd things like cause variable scope issues).
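The brace placement argument can be checked with a small sketch. The snippets below are invented for illustration; each is compiled with a string eval so that both outcomes can be observed from one script.

```perl
use strict;
use warnings;

# Same-line brace: commenting out the "if" line orphans the closing brace,
# so the mistake surfaces as a compile error.
my $same_line = <<'END';
my $ran = 0;
# if( 0 ) {
    $ran = 1;
}
$ran;
END

# Next-line brace: commenting out the "if" line leaves a bare block that
# still compiles and now runs unconditionally.
my $next_line = <<'END';
my $ran = 0;
# if( 0 )
{
    $ran = 1;
}
$ran;
END

my $same_result = eval $same_line;
my $same_error  = $@;
my $next_result = eval $next_line;
my $next_error  = $@;

print "same-line style: ", ($same_error ? "compile error" : "ran"), "\n";
print "next-line style: ", ($next_error ? "compile error" : "ran, \$ran = $next_result"), "\n";
```

The first snippet is rejected by the compiler, while the second quietly executes a block that was supposed to be guarded by the condition.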

Then we started discussing using ampersand, ‘&’, to begin function calls. My reasoning is that I more often than not prefer to be as explicit as possible, and the ‘&’ lets me do that. I failed to recall an example that led me to prefer the ampersand, but I eventually found it. Basically, function calls should look like function calls and not keywords, macros, or other environment lexicals. &foo($arg1, $arg2); looks a bit hairy and dated (generally a perl4 way of doing things), but it’s clear from the first character what is about to happen. My brain needs only to parse the first character to read the code with the right mindset. I am calling a user-defined function (not a built-in), named ‘foo’, and passing 2 arguments. That is clear, readable, and will likely work for the foreseeable future; if not, it’s still easy to find and correct.

On the other hand, foo; or foo(); (under strict) is not necessarily clear. The first example is basically a bareword and could be any number of things. It could be a symbol, a package, a string, or a function call. If it is a call, working out which arguments it receives requires more investigation (in Perl 5 it is the related &foo; form, without parentheses, that passes the caller’s @_ along). The second one looks like a subroutine call, but I have to parse 5 characters and then grep around for a sub named foo within the current namespace (was it loaded elsewhere and exported to my namespace?). While both of these are more compact and concise, they also both require more work to figure out exactly what is happening.

Also, foo($arg1, $arg2); is clearer, but not until I’ve read a minimum of 4 characters do I start to think it might be a function call. This does not parse and skim nearly as quickly, at least to me.
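For reference, here is a small sketch of how the call forms discussed above behave in Perl 5. The sub names show_args and caller_demo are invented for the example.

```perl
use strict;
use warnings;

# Report how many arguments arrived and what they were.
sub show_args { return scalar(@_) . " arg(s): " . join(',', @_) }

sub caller_demo {
    # Called below with ('a', 'b'), so @_ here holds two elements.
    my $explicit = &show_args('x');   # sigil + parens: only 'x' is passed
    my $plain    = show_args('x');    # no sigil: same arguments, same result
    my $empty    = show_args();       # empty parens: no arguments at all
    my $implicit = &show_args;        # sigil, no parens: caller's @_ is reused
    return ($explicit, $plain, $empty, $implicit);
}

print "$_\n" for caller_demo('a', 'b');
# 1 arg(s): x
# 1 arg(s): x
# 0 arg(s): 
# 2 arg(s): a,b
```

The last form is the one to watch: an ampersand call without parentheses silently hands the current @_ to the callee, so the explicit style is safest when the parentheses are always written.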

All of this skimmable code talk (note: I don’t agree with Schwern; end-of-scope comments usually clutter code more than they help) may sound frivolous to those readers who deal with thousands of lines of code. It’s not something you can truly appreciate until you maintain code that weighs in with at least 9 digits (executable code only). I personally manage 250,000 lines and I am responsible for a product that is about 2 million lines (all 30 of our branches are about 2 million lines each).

In the end, I stick by my preference for ampersand function calls unless someone else can point out a better reason to ditch them.


6 comments

Aug 30

Low Hanging Fruit

Author: s1n
Category: 01100011, Meat Space

Today we held the first Perl 6 mini-hackathon. Ironically, there was another Perl 6 hackathon happening in California. A few problems cropped up but the end result was a positive experience. I’ve never been to or held such an event, so I wasn’t sure what to expect.

First, the location we chose, Borders, changed its wifi system. Since Seattle’s Best is owned by Starbucks, they switched to the non-free T-Mobile service. We ended up moving next door to Market Street, the location of our usual Perl 6 Mongers meetings. Unfortunately, this location is much louder than Borders, but we made it work.

Then one of the attendees was unable to connect to the wireless network. We spent about 30 minutes trying to help out but ended up having to move on. Next time, I will make sure to bring Rakudo and Parrot on a flash drive.

fRew and I set up a fresh Rakudo build. I ran the spectest and subsequently, it crashed my machine. Yeah, it locked up my machine and I had to reboot it when I got home (remotely logged in to my desktop development system). I couldn’t find anything in my system’s logs and the test suite was run in parallel. It seemed to stop spitting out output while it was working on S05-named-chars.

Eventually, it was just Patrick and me. We came up with a system for labeling RT tickets that should be relatively quick fixes. First, you have to filter out tickets assigned to Jonathan; these are typically more difficult issues to fix. Then we decided that we would tag the tickets “LHF” to mean Low Hanging Fruit. For now, you can find “[LHF]” in the subject, but we will eventually have a tag created similar to the tests-needed tag.

All in all it was lots of fun. Next time, I will make sure to have an install CD or two, Rakudo and Parrot on a thumbdrive, and find a quieter location with wifi. Next time, we’ll hopefully be able to spend more time working on the LHF tickets.

The next event will be next month, two weeks after the Dallas.p6m meeting. The next Dallas.p6m meeting is slated for September 9th and the next Mini Hackathon is slated for September 26th. We need to set up our website and a calendar.


6 comments
