Macha

76 IEs? Not Likely.

September 27, 2011 | categories: Programming, Technology

Paul Irish believes that by 2020, web designers will be forced to support 76 variations of Internet Explorer between the nearly annual release schedule IE is on lately and the compatibility modes each of these IEs will have for previous versions of Internet Explorer. ¹

IE6's entrenched position came from the fact that (a) it was the latest version of Internet Explorer for a huge amount of time and (b) its status as the IE dead end for Win2k and below.

When IE7 came out, any company that still had any Win2k machines had to keep designing with IE6 in mind if they wanted their new apps to work on all their computers. (I'm making the assumption that if they were relying on IE previously, they couldn't just switch to Firefox or something).

Now, I think most people can accept that IE8 is going to end up in IE6's current place. It's the IE dead end for XP, a hugely popular OS. But IE7? None of the companies that don't upgrade ever moved to IE7. Home users that upgrade will also have installed the IE8 upgrade. So you're left with what? Unpatched Vista installations. These are much rarer than unpatched XP installations, simply because Vista had a shorter lifespan, and Windows Vista to 7 is a sufficiently undramatic upgrade for the types of people who would take years to go from XP to Vista.

So so far we have:

IE6 will drag on as long as XP does.
IE7 won't last particularly long. While it's popular now, earlier Vista computers will be replaced in the close future (2-3 years), causing it to lose market share to IE8.
IE8 will have a long lifespan, although probably not as long as IE6.

IE9? IE9 has never been shipped by default with any version of Windows. That means anyone who installed it did decide to upgrade. These users will likely upgrade away, meaning in the future, IE9 will be even more of a non-issue than IE7.

IE10 will likely also go the way of IE7. While it will be installed by default on Windows 8, the amount of dramatic changes in W8 will scare off many of the companies that are slow to upgrade.

So in 5 years' time, what versions of IE will you realistically need to support?

IE6 (maybe - probably, hopefully, enterprise only at this stage)
IE8
IE10 (enterprise will never use it because Win8 is scary and different to them so for home users only)
IElatest-1 So IE13 or something?
IElatest IE14 or something.

Needing to support IE6 and IE10 will likely be mutually exclusive, so that's 4 versions for sites targeted at home users and 5 for sites aimed at both enterprise and home users. Still ugly, but far from 76. And all those versions will be dead in the timescale that the article is using. Insofar as IE6 will ever die, anyway.

IE6 for home users will be dead at that point. Most of those old early XP computers will be "broken" and replaced, even if "broken" is just slow and annoying. Using XP in five years will be like using Win98/Win2k. Yes, people do use them. No, they aren't a large enough group for most to worry about. I even have a small amount of hits from Netscape 6. I haven't a clue what my page looked like for them, and don't care.

In theory, if even IE is aiming for at least yearly releases from now on, no future IE will end up in the position that IE6 is in, and that IE8 will find itself in, as upgrading your browser frequently becomes a fact of life. The compatibility modes will be much less important too, as the shorter lived the browser, the less likely that the compatibility mode for it will ever be used.

This post was originally posted as a comment on HN. Check the thread for possible replies.

Not Invented Here and New Programmers

May 08, 2011 | categories: Programming

The general consensus is that one of the best ways to learn how to program better, beyond learning the basic syntax, is to just go ahead and write some programs on your own. Another consensus is that the best type of program to write is one that scratches your own itch. Yet another consensus is that it is best to avoid the "Not Invented Here" syndrome of writing everything from scratch and instead reuse as much code as possible.

However, for someone learning to program today, large parts of any itch they could scratch have, for the most part, already been scratched by someone else. Usually, this is someone who has done a much better job of it than any newbie's code would. This means that if they follow the advice regarding NIH, they would be reduced to writing glue code for quite some time, tying together libraries written by other people while they program relatively boring code.

The problem with this is twofold:

The newbie doesn't really learn much about designing their own programs. Sure, they see how the (hopefully) well written libraries do it, but they don't get to see the thought process required, or any of the refactoring that removes earlier bad design decisions.
Glue code is boring. How many people are driven away from programming by this experience?

One of the first big programs I wrote personally was a PHP social network. I did many parts of it from scratch - a database abstraction library, a templating system, even a primitive MVC system¹, and so on. The code that resulted was horrible, and probably riddled with security problems, but I learned a lot from the process.

How would a newbie do that nowadays? They'd install cakePHP or Django or Rails, getting them a DBA layer, templating, MVC all written for them. And sure, the resulting program will probably be cleaner and less buggy, as all the hard parts are handled by the framework. But the job of the programmer gets relegated to writing some models and a few views that are pretty much the same everywhere.

But then a lot of the benefit from trying to write a program for themselves just isn't there. They don't get the benefit of finding out why some ideas don't work, as they just use the framework. Most of what they learn are the APIs of the framework, not any of the thought processes involved in creating the program. People writing more complex programs than anything a newbie would attempt might be delighted to have those problems taken out of their hands while they figure out how to make their database not fall over with 20k users, but the newbie isn't in that position.

The fun of programming is solving new problems. That's why Rails and co. are so popular. After you've written your second or third web app, these problems quickly become old problems, and having Rails handle them for you is really convenient. But a new programmer hasn't solved these before. Their programs will not be as ambitious (as anything that ambitious will most likely be dismissed by them as too hard), so having Rails solve these problems will leave them without any problems to solve.

Of course, another benefit is that when they are finished, they can go look at the existing solution to see how it compares to their own solution. This can help them to compare the thoughts that led to their own solution to the (presumably) better result of the existing solution. If they managed to find some itch that hadn't been scratched by someone else already, they wouldn't be able to find a sample to compare to their own work.

In short, laying off the avoidance of "Not Invented Here" can help newbies learn quicker in many cases than being relegated to writing glue code.

¹ Before I knew what MVC was, so it ended up as more of a VC system.

The Magic Layer

January 16, 2011 | categories: Programming

Any sufficiently advanced technology is indistinguishable from magic.

Arthur C. Clarke

As a programmer, and a computer nerd, I obviously know more about how they work than if you just grabbed a random person off the street. As such, it's always interesting to see how people with less knowledge about computers use them.¹ To a lot of people, they click some buttons on the screen, and then the computer does something that may as well be magic, and finally results appear on screen.

However, it's not just those unfamiliar with technology that find that at some level, the processes going on may as well be magic. Very few people understand the entirety of how things work from top to bottom. Moving on from the unfamiliar user and the buttons they click, many poorer developers type (or copy and paste) in some code they don't really understand and somehow it all works. Their magic layer is lower than that of the user who doesn't understand how the program actually works. After all, they understand that programming code needs to be written, but they still don't understand any of what goes on beyond that. Their magic layer is off in the function calls, in the syntax of the language, in the meaning of the code. They may understand small sections, but the entire program as a whole is still magic to them.

Moving on from the poorer developers, there are novice developers, whose problem is not incompetence but simply knowledge they don't yet have. They understand for the most part how their code works. They know how to structure their functions, they know when to use objects (and when not to), they can create programs from scratch. But they may not quite understand what is going on in the library routines, or what happens when the compiler creates a program from their code.

As you go further up in the skill of the developers, the magic level recedes. They realise that the library routines are just like functions they create themselves for the most part. Some might make system calls, but other than that, there isn't much magic there. They understand that the compiler reads their code, and produces machine code which is run by the CPU. The magic is banished from their code to the inner workings of the compiler/interpreter and the operating system.

Finally, once the developer learns how those last few retreats of magic work, they can understand pretty much the whole picture, as far as software goes. However, this still isn't the final end of the story. Sure, the magic might be gone from the software side, but there is still the hardware. How do those system calls translate to data being written to disk? How does the CPU know that MOV eax, ebx moves data between registers? And at this point you're in the hardware layer.

So, where is your magic layer? Personally I'm at the "inner workings of the compiler/OS" stage, and to work on pushing it further back, I've found some useful online resources. As a self-taught programmer, most of the resources I've had up until now glossed over these areas, and it's my main aim for this year to push the magic layer back into the hardware. For compilers, I've found Jack Crenshaw's Let's Build a Compiler useful to start with, and I'm currently reading through that. For operating systems, a bunch of useful articles have coincidentally popped up on Reddit and HN recently on that topic, though I'm still open to suggestions on other resources.

Some further discussion of this post can be found at Hacker News and on Reddit.

¹ Except when they frustratingly do everything in the most convoluted way possible.

A quick, basic primer on the IRC protocol

October 14, 2010 | categories: Programming, Technology

While working on my Haskell IRC bot, I needed some information on the actual IRC protocol. Much of this information sadly isn't available in any centralised format, and much of the information that is there is just a copy/paste of the RFC. There are two formal descriptions of the IRC protocol, an older one (RFC 1459) and a newer one (RFC 2812), though the actual protocol as used by most servers doesn't adhere exactly to either of these. So, here is a short summary of the information that I have gathered in my research. This is by no means a comprehensive tutorial, but it is sufficient to write a basic IRC bot.

The first part of the IRC protocol is the rough layout of messages. The first, optional part, is the source (username and hostmask, or server) preceded by a colon, as in :holmes.freenode.net . Because they only deal with one other part of the network, the server, clients will rarely, if ever send this part, while servers nearly always will.

The next part of the protocol, separated by a space, is the command name, which is in all-caps. Most of these are pretty much the same as what the user types in after the /. For example, the /join command becomes JOIN. The most notable exception is the PRIVMSG command, which is used for sending a message to a user or channel (it's the same command for both).

After this come the arguments for the command, again space separated. Most of these are limited to one-word values. The one exception is the final argument, which can have more than one word, and is started off by a colon.

There are a few types of channel, but nearly all the channels you will encounter are of the #channel variety, so we will not go into detail on other types.

Finally, the command is terminated by \r\n, not \n according to the spec, though it seems most servers will accept either.

An example of a full message is as such:

:Macha!~macha@unaffiliated/macha PRIVMSG #botwar :Test response
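To make the layout above concrete, here is a rough sketch of a parser in Python. The names Message and parse_message are my own for illustration and don't come from either RFC or any library, and it ignores the less common corners of the protocol.

```python
from typing import NamedTuple, Optional


class Message(NamedTuple):
    source: Optional[str]  # text after the leading colon, or None if absent
    command: str           # e.g. "PRIVMSG"
    params: list           # arguments; only the last one may contain spaces


def parse_message(line: str) -> Message:
    """Split one raw IRC line into source, command and arguments."""
    line = line.rstrip("\r\n")  # accept either \r\n or a bare \n
    source = None
    if line.startswith(":"):
        # Optional source: everything up to the first space.
        source, _, line = line[1:].partition(" ")
    # The final argument is introduced by " :" and may contain spaces.
    head, sep, trailing = line.partition(" :")
    parts = head.split()
    if not parts:
        raise ValueError("empty IRC message")
    if sep:
        parts.append(trailing)
    return Message(source, parts[0].upper(), parts[1:])
```

Feeding it the example line gives a source of Macha!~macha@unaffiliated/macha, the command PRIVMSG, and the arguments #botwar and Test response.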

The first part of any IRC connection is sending the NICK and USER messages. The first of these is simple, just NICK name. The next is the USER message.

An example of a USER message is:

USER username 0 * :Real name

The * part is a remnant of earlier days, and will not need to be changed. The 0 is a bitmask for the user's initial mode, in which RFC 2812 only gives two bits any meaning. Change it to 8 to be invisible to those not in a channel with you (4 asks to receive wallops instead).
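As a small sketch, the two registration messages could be built like this in Python. The function name registration_lines and its parameters are made up for this example; the only mode value it knows about is the 8 for invisible mentioned above.

```python
def registration_lines(nick: str, username: str, realname: str,
                       invisible: bool = False) -> list:
    """Return the NICK and USER lines a client sends right after connecting."""
    mode = 8 if invisible else 0  # 8 = invisible, 0 = no special mode
    return [
        f"NICK {nick}\r\n",
        # The real name is the final argument, so it may contain spaces.
        f"USER {username} {mode} * :{realname}\r\n",
    ]
```

Each string is already terminated with \r\n, so it can be encoded and written to the socket as it is.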

The next part of the protocol we will discuss is the PING message, because some servers need one immediately after these two messages. The server will send you a message in the format PING :message to which it needs a response of PONG :message. This is the most common case of a server not sending a source. Most servers use the server name as the message part, but this isn't consistent.
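Answering a PING is simple enough to sketch in a few lines of Python. The helper name pong_reply is hypothetical, and it assumes the server sent the PING without a source, which is the common case described above.

```python
from typing import Optional


def pong_reply(line: str) -> Optional[str]:
    """Return the PONG line answering a PING, or None for any other message."""
    text = line.rstrip("\r\n")
    if text == "PING" or text.startswith("PING "):
        # Echo back whatever followed PING, unchanged.
        return "PONG" + text[4:] + "\r\n"
    return None
```

Because the message part is echoed back untouched, it doesn't matter whether the server uses its own name or something else there.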

For all of the rest of these messages, there is a source on the other messages from the server side. This is a user and hostmask for a user's message, and a server name otherwise. If you are writing a client, do not send the :source part.

The next message to deal with is JOIN. The basic format of this message from most servers I've tested is

:source JOIN :#channel

although the spec says otherwise as regards the need for the colon. This is pretty self explanatory, and works the same as the /join used in an IRC client. The one unintuitive part of this command is that JOIN 0 leaves all channels.

Its counterpart is PART. Its format is

:source PART #channel :reason

The reason part is optional, and some servers (for example Freenode) seem to just cut it off, as it did not exist in earlier versions of the IRC protocol.

Both of these messages can also accept a list of channels separated by commas when sent from the client, for example PART #channel1,#channel2. Don't put spaces between the channels in this list.
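On the client side, building these two messages might look something like the Python below. join_line and part_line are names I've invented for the sketch, and as a client it leaves out the :source part.

```python
from typing import Optional


def join_line(channels: list) -> str:
    """Build a client JOIN for one or more channels."""
    # Channels are joined with commas and no spaces.
    return "JOIN " + ",".join(channels) + "\r\n"


def part_line(channels: list, reason: Optional[str] = None) -> str:
    """Build a client PART, with an optional reason as the final argument."""
    line = "PART " + ",".join(channels)
    if reason:
        line += " :" + reason  # some servers may simply drop this
    return line + "\r\n"
```

For example, part_line(["#channel1", "#channel2"]) produces the PART #channel1,#channel2 message from above.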

The most important command in the IRC protocol is PRIVMSG. This command is used for sending messages both to channels and between users. Its format is

:source PRIVMSG <target> :Message

where the target is either a user's nick, or a channel name. So to send a message to a channel, use PRIVMSG #channel :Hello, World and to send a private message to a user, send PRIVMSG Nick :Hi!.
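For a bot, the usual thing to do with an incoming PRIVMSG is to answer wherever it came from. Here is a sketch in Python; privmsg_line and reply_target are hypothetical names, and the reply rule (answer in the channel for channel messages, otherwise answer the sender's nick, which is the part of the source before the !) is my own assumption about what a bot wants rather than anything the protocol requires.

```python
def privmsg_line(target: str, text: str) -> str:
    """Build a client PRIVMSG to a channel or a nick."""
    return f"PRIVMSG {target} :{text}\r\n"


def reply_target(source: str, target: str) -> str:
    """Work out where a reply to a received PRIVMSG should be sent."""
    if target.startswith("#"):
        return target  # sent to a channel: reply in the channel
    # Sent directly to us: reply to the sender's nick (nick!user@host).
    return source.split("!", 1)[0]
```

With the earlier example message, the reply target is #botwar; had the same user messaged the bot directly, it would be Macha.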

The final message you will use in basic usage is QUIT. Its format is

:source QUIT :reason

where the reason is optional.

These commands are sufficient to write a basic IRC bot but are by no means the full list. There are also numeric commands, and you can find more detail on these here.

A Self-Taught Programmer's Journey

September 27, 2010 | categories: Me, Programming

I was helping someone who is just beginning to program over the past few weeks, and it led me to actually write a blog post I'd been saying I should write for a long time, about how I got where I am today as a programmer. This is mostly from memory, so some of the timing may be wrong.

My first experience with anything even close to programming was when I was around 12. I had gone through all my mum's ECDL stuff for MS Office years before, and there was one program in Office on my then computer that wasn't even mentioned - Frontpage. So I got bored one day, and decided to check what it was. I gathered quickly enough that it was for making websites, and I sort of got parts of it, but I had to get a book from the library to figure it out fully. Armed with this newfound knowledge, I managed to make a basic website, a horribly primary-coloured frame-filled website, but a website nonetheless.

Of course, my instinct was to show it to people, so I found out about Angelfire for hosting from an even older book, and uploaded it, then showed it to people online. Needless to say, people didn't like it very much. But a few people did give me some somewhat useful advice, to drop Frontpage. They mentioned something called Dreamweaver as being better, and also suggested learning HTML. Which I did, until eventually I was using Dreamweaver as a glorified text editor.

Meanwhile, looking for the next thing to do, I decided to learn actual programming, in the form of Java. In retrospect, not the best choice of language for starting out, but it was one of the few that my local library had books for. The other alternative if I was to stick to library books was Visual Basic 6.0, so it was definitely the better of the two options. I went through the whole of Java All-In-One for Dummies, did all the examples, got to the end, and wrote a little Java clock applet for my website. I was so happy with myself that I decided to go even further, and my next plan of a project was to make a Simcity clone in Java. Unsurprisingly, that didn't work out too well, and the failure put 13 year old me off programming for a while.

By this stage, I had long cured any illusions of coding being anything like the Matrix

The next time I came into contact with programming was when I was running a forum with a few friends. It ran off phpBB, and when installing mods, I noticed that all I was doing was editing programming code, which was kind of familiar. I looked into it a bit more, found out about PHP, and then quickly learned that. It was significantly easier to learn, both because it was my second language to learn, and also because languages with weak dynamic types are easier to learn in my opinion.

Of course, the problem with learning PHP, especially if you use the phpBB 2 code base as your example, is that you pick up bad habits. Now there's nothing inherently wrong with the language nowadays, but this was the PHP 4 days, with its half-assed attempts at OOP. On top of that, something that is still a problem today is the amount of bad material floating around. Now, this is a problem with any language, but given PHP's large userbase, and popularity with hobbyist coders, it's far more pronounced in PHP than others I have used. So code found on the internet is rife with a lack of care towards separating output and processing, full of SQL injection holes, and often even relying on register_globals. But this isn't anything new. But it does lead me to recommend to everyone trying to get a good book. I started with this one, and while it did help me learn PHP, it set me back a long distance in terms of proper coding practices. Eventually, I got this book to learn how to code properly in PHP.

Around that time I started reading programming blogs, and found out about source control, in the form of Subversion, a rather important topic that no programming book seems to mention. Seriously, apart from Code Complete, which mentions it glancingly a few times, the only book on my shelf that explains source control at all is the Linux Bible, 2009 edition. And that's talking about CVS. Now for all Subversion's faults, any source control is better than no source control. Later I was introduced to git and DVCS in general by some friends, and it's now my preferred form of source control (git for all Linux projects, hg if I have to work with Windows users). It didn't exist at the time, but for anyone looking for a good DVCS tutorial now, I recommend Joel Spolsky's hginit. Even if you plan on using git or another program afterwards, it's the best basic explanation of the concepts of DVCS I've seen.

This was the only book to mention version control.

Going back to my own journey, after that time in PHP, and several ill-fated projects, countless attempts at building a forum, an incomplete social network, and a completed, if basic, social network that I failed to get anyone to join, I moved on to Javascript. I spent ages messing about in that, making small scripts to move things around, before getting more ambitious. Yet again, I returned to the idea of a Simcity clone. Given that this was before canvas became well supported or known about, the core of the program used a 200x200 table for the game's grid. It was every bit as slow as that sounds.

After a while of this, I returned to trying to make GUI desktop apps. I know, I know, many consider this a step backwards from web apps, but at the time, I couldn't think of many examples of web apps that had progressed past the "Oh, that's mildly interesting" stage. I had used Swing before, in my failed Simcity game, but wasn't particularly impressed by it. I gave SWT a go, before giving up again. I guess my problem with GUI apps stems from my desire to control every last detail, compared to the designer-tool-centric design of most of the GUI toolkits. Somewhat bored with GUI apps again, but also bored of the general dislike of my main language, PHP, I went on a quick tour of several languages, including C#, C++, Ruby, Python, Ruby again, before finally deciding on Python as my new favourite.

Around this time, I switched to Linux as my primary OS. I'd tinkered with Linux for a while before this, but had always been too used to Windows to actually make the switch until then. Since then, Windows has been mostly relegated to use for gaming and iTunes.

I spent a while coding in Python, before going on a short hiatus from programming, being somewhat bored as all the languages I had learned were roughly the same at this stage, and having no new ideas for a project to keep me interested. Projects are another thing I recommend to anyone starting to teach themselves programming, as even the failed ones or ones you get bored of are a good learning experience, especially if you have others looking over the code.

After a while, I decided to learn a less mainstream programming language, and after a quick straw poll on Twitter, Haskell emerged the winner. Luckily for me, Haskell has a good beginner's tutorial, Learn You a Haskell. I've heard it described as like Why's Poignant Guide, but it's not really. It's much more to the point which works for me (I'm sorry to say I found Why's Guide a rather boring read, having too much unneeded meandering with stories. But I accept it's good for others). It is a rather basic tutorial, but there is also Real World Haskell for when you need to go further, even if it isn't as well done.

And now, my current project is to write an IRC bot in Haskell to learn Haskell better. I'm still learning and improving, and it's going to be a handy headstart for college that I have all this done already, so the story goes on.