Software

Creating your own personal aspell dictionary

Something that has bothered me forever is that applications that use GNU aspell for spell checking keep marking my name as a misspelling (I’m looking at you, KMail). Most front-end applications don’t provide a way for you to add your own custom words.

Apparently, creating your own personal dictionary is ridiculously easy with aspell.

If your language is English, create a file in your home directory called “.aspell.en.pws”:

personal_ws-1.1 en 0
Samat
quasirhombicosidodecahedron

The first line is a required header. Every subsequent line is a word you want to add to your dictionary. I can’t believe I’ve let this sit for so long. Because it’s a nice text file, syncing this file between machines to take your dictionary with you is trivially easy.
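Here is a minimal sketch of how the file could be maintained from a shell. The function name add_word is my own invention, not part of aspell; it assumes the English word list lives at ~/.aspell.en.pws unless a different path is given as the second argument.

```sh
#!/bin/sh
# add_word WORD [FILE]
# Hypothetical helper (not part of aspell): appends WORD to a personal
# word list, creating the file with its required header if needed.
add_word() {
    word=$1
    pws=${2:-"$HOME/.aspell.en.pws"}

    # Write the header line first if the file is missing or empty.
    [ -s "$pws" ] || printf 'personal_ws-1.1 en 0\n' > "$pws"

    # Append the word only if it is not already listed (exact line match).
    grep -qxF -- "$word" "$pws" || printf '%s\n' "$word" >> "$pws"
}

# Example: add_word Samat
```

Because the helper skips words that are already present, it should be safe to run repeatedly, for instance after syncing the file from another machine.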

Amarok 2 uses MySQL embedded as a metadata store

There’s been a bit of turmoil in the Amarok and KDE communities the past week with Amarok’s decision to only support MySQL Embedded in Amarok 2. Jeff Mitchell has written about the Amarok design decisions made.

I’m a little bothered by this, as it forgoes all the “semantic desktop” work that has gone into KDE 4, namely what’s provided by the Strigi and Nepomuk libraries. One thing the whole semantic desktop concept entails is that other applications will be able to use data another application stored, without caring what that other application was or how the data was stored. For example, I should be able to share the list of all tracks in my music library, how many times I’ve played tracks, which tracks are my favorites, etc., across music players. This kind of abstraction is, obviously, good for users, but bad for developers of proprietary software: they don’t want you to switch easily to applications that they do not control. Amarok switching to its own database store is a blow to this kind of desktop interoperability. I’ve my own thoughts to add, though, that support what the developers are doing…

Amarok is an awesome application. Dare I say, it’s a killer application on Linux—on several occasions this past year I’ve recommended people install Linux just so that they could play with Amarok and see how much better it is compared to what they were using (yes, I’m looking at you, iTunes).

Before Amarok, I used Music Player Daemon (mpd). I stopped using it after a while: the playlist management wasn’t very good; it would eat those playlists that I spent a lot of effort to make; the GUIs available at the time were lacking; and it was very slow when working with tens of thousands of songs. Some of this may have changed but I’ve not been motivated to look back.

Enter Amarok: I switched because the playlist management was so much better. I set up a MySQL server on my workstation to store metadata, as SQLite was much too slow. Amarok backed with MySQL is very fast; I dare others to find a library-based music manager that is faster with the number of songs I’ve thrown at it.

Balancing desktop interoperability with performance is a delicate act. Interoperability is the hot thing these days: look at how Apple’s line of integrated software and hardware continues to sip market share from the Microsoft-powered desktop. But when it comes down to it, performance and other more readily perceived benefits are going to win out over desktop interoperability. The Amarok developers’ decision to go with MySQL Embedded is a good one that will hopefully keep people moving to Amarok over proprietary alternatives.

jQuery: the new de facto JavaScript web framework

News from a couple of days ago: both Microsoft and Nokia are now including the jQuery JavaScript framework as part of their development kits. That is: jQuery will be part of Microsoft’s ASP.NET AJAX framework and be available for use in applications written for ASP.NET; and jQuery will also be distributed on millions of Nokia phones.

De facto standards, I believe, are a good way to inform the development of real standards. Standards developed the other way around, at least in the tech industry, have had a habit of taking a very long time to reach end consumers… for example, how many decades has it taken for your average web user to gain access to a fully CSS2-compliant web browser? How many more decades will it take for OASIS’s OpenDocument format to supplant Microsoft Word and its *.doc files?

Hopefully, this is the beginning of a path that will lead to jQuery’s inclusion in the JavaScript language, as well as initiatives that will improve jQuery’s performance.

I like the fact that Microsoft and Nokia are not trying to reinvent the wheel by rolling their own JavaScript frameworks. Sun did this with JavaServer Faces. A frequent lament with JSF is that it’s nearly impossible to customize any of the widgets. There is too much complex, custom JavaScript, and the adoption of the frameworks used makes figuring out how to work with them difficult.

Also, as others have noted, this is the first time Microsoft itself is distributing an open-source project with one of its products. A sign of things to come?

Has the war on spam been lost?

O’Reilly Radar has an article written by Dale Dougherty, a roundtable set of opinions on whether the war on spam can be won. Rafe Colburn also has his own response.

Rafe’s solution is to use Gmail. In Dougherty’s article, Paul Vixie mentions that the internet is going to become a “walled garden”; relying on proprietary technology provided by a single company is the same thing in my eyes. There’s no way I’m going to advocate a proprietary solution for something as important as my e-mail.

Eric Allman mentions DKIM, which I think is an excellent weapon in the war on spam. I’m not using it, however, as it doesn’t fit in with the way I use e-mail, and support for it in MUAs (e-mail clients) and MTAs (e-mail SMTP servers, essentially) is extremely sparse.

My unfortunately ineffective and impractical solution to this problem is the use of PGP. Besides identity verification via digital signatures, it is also a generic platform for encrypted digital communication, and provides a distributed, robust trust model. Unfortunately, its learning curve is steep, and that is why it’s basically been a failure for the past 10 years.

Though, lack of user education is why the spam problem keeps getting worse too. It’s users who click links in spam e-mail; it’s users who allow spammers to take over their machines through their negligence in applying security updates; it’s users (sometimes) who allow their identities to be stolen.

New MediaWiki skin based off of MySkin: NullBook

I’ve created a new theme for MediaWiki-based sites (by extension, Wikipedia as well) called NullBook.

I do not like Monobook; I feel that it wastes too much space on the screen, is difficult to read, and doesn’t follow established web usability guidelines. Some of NullBook’s features:

  • Fixed-position “tab bar”, set to use GUI colors, that stays with you as you scroll down a page. The tab bar contains the contents of the “Views” menu, as well as the go/search box.
  • No font size definitions; uses your browser’s default. Skin (mostly) scales well with font size changes.
  • Underlined links, and more recognizable link colors. Blue for unvisited links, purple for visited links, and green for broken (or uncreated) links. Moving over a link highlights it to red.
  • Removed sidebar to stop wasting vertical screen real-estate, and relocated it to the bottom of the page. Also hid the other languages list since I tend to only look at English Wikipedia articles.

You can find more information about NullBook (including screenshots) by looking at the NullBook section of Wikimedia’s Gallery of user styles.

Misery with online reading of PDFs and the need for portrait monitors

In the process of writing a term paper for a class, I’ve been paging through many research papers.

Unfortunately, many of these research papers are only available for reading via PDF. Even for those papers that have full text on a normal webpage, complex login and authentication systems (i.e. I can only access said page through my university library) force me to save PDFs to facilitate later reading.

PDFs are really miserable for reading on the computer. My gripes:

Fixed font styles
Many PDFs use serif fonts, which are generally difficult to read on screen (though fine in print). Some irate designers even create PDFs that use “Times New Roman,” which, despite being the default in many web browsers, is ugly and difficult to read. In a web browser, you can change it; in a PDF, you are forced to suffer with it.
Fixed font sizes
Font sizes are fixed in PDFs; you cannot change them. Often when reading on screen, fonts are either too large or too small. This is compounded by…
No wrapping
Text is statically laid out, so you are completely reliant on sizing your window and adjusting your zoom to be able to read a block of text, or stuck with moving your scrollbar back and forth.
Columns
Computers have scrollbars. Columns make absolutely no sense when you can scroll. The worst case comes up when you combine columns AND scrolling: you have to scroll down to finish reading a column, and then scroll back up to begin reading the top of the next column.

Usability expert Jakob Nielsen thinks so too: in 2003 he wrote a column titled “PDF: Unfit for Human Consumption.”

It seems that some of these problems stem from a mismatch in orientation. Computer monitors are generally landscape; PDFs and printed media are portrait.

And computer monitors just keep getting wider. While widescreen is nothing short of awesome for movies and television, it’s not that useful for computing. The classic use case is the accountant with a wide spreadsheet: but how many people have wide spreadsheets? Because most people use computers to create content in a portrait orientation, and because most content we read expands downward rather than to the side, it would make sense for monitors to be in a portrait orientation rather than landscape.

Fortunately, this is easy to try out now. Most LCD monitors swivel into portrait orientation with a flick of the wrist. Microsoft Windows and Linux (through the XRandR extensions) have provided orientation switching support for a few years as well.

But it’s not yet usable by the mainstream. For example, on Linux with nVidia’s binary drivers, running in portrait means losing out on accelerated 3D as well as multimonitor support, things many people (including myself) are not ready to lose.

Giving up on my bookmarks system and joining del.icio.us

I wrote my bookmark system a few years ago because I had no good way to share bookmarks online, or among web browsers on different machines on different platforms. I’ve not ported it from the old site to this new one, and I’m not sure I care… While my bookmark system did what I wanted it to do, it was not flexible. I look at the PHP code I wrote and remark: I hate this.

So, I now use del.icio.us. Am I a Web 2.0 (I hate that term) loser now?

Bought my 15th domain--randomized.info

I have bought my 16th domain name today, randomized.info, for a project that I am going to do some day soon.

The way I see it, domain names are like Internet real estate. And indeed, some people market them like this (though I wouldn’t). The domain name system and the top-level domains everyone knows and loves (.com, etc) are not going away anytime soon, nor is there any kind of suitable replacement to solve its inherent problems.

Trying to emulate mod_gunzip with Apache 2 Filters

The situation: I have gzipped content stored on an Apache 2 web server. Specifically, HTML files–they are stored in this manner to save disk space. For clients that can handle on-the-fly decompression of such files, I want the files to be sent verbatim; for clients that cannot, I want the content decompressed and sent to these clients.

mod_gunzip by Helge Oldach is an Apache 1 module made for dealing with stored gzip files. It can negotiate with a client what kind of encoding it can accept, and send the appropriate compressed or non-compressed version. Unfortunately, at this time, this module is only available for Apache 1.

Helge Oldach notes that it should be possible to create the equivalent mod_gunzip functionality using only Apache 2 filters. To an extent, yes. I’ve done so:

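# Define an external output filter that pipes the response body through gunzip.
ExtFilterDefine gunzip mode=output cmd="/bin/gunzip"

# Apply that filter to every stored .gz file.
<Files *.gz>
  SetOutputFilter gunzip
</Files>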

This decompresses every .gz file for every client. It won’t do the sophisticated (well, at least more sophisticated than the Apache 2 runtime configuration directives will allow) negotiation that mod_gunzip can do, watching for certain clients and combinations of headers.
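To make the missing piece concrete, here is a rough shell sketch of the kind of decision such negotiation involves. It is not mod_gunzip’s actual logic, and the helper names accepts_gzip and serve_gz are hypothetical. It only looks at the Accept-Encoding header value, ignores wildcards, and does none of the per-client special-casing.

```sh
#!/bin/sh
# accepts_gzip HEADER
# Hypothetical helper: succeeds when an Accept-Encoding header value lists
# gzip (or x-gzip) with a non-zero q-value. Wildcards ("*") are ignored.
accepts_gzip() {
    printf '%s\n' "$1" |
    tr '[:upper:]' '[:lower:]' | tr -d ' \t' | tr ',' '\n' |
    awk -F';' '
        $1 == "gzip" || $1 == "x-gzip" {
            q = 1                                  # default weight
            for (i = 2; i <= NF; i++)
                if ($i ~ /^q=/) q = substr($i, 3) + 0
            if (q > 0) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# serve_gz HEADER FILE
# Hypothetical helper: writes FILE (a stored .gz file) to stdout, verbatim
# if the client accepts gzip, decompressed otherwise.
serve_gz() {
    if accepts_gzip "$1"; then
        cat "$2"              # client can decompress: send stored bytes
    else
        gunzip -c < "$2"      # client cannot: decompress on the fly
    fi
}
```

A real server would also have to set the Content-Encoding response header to match whichever branch was taken.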

So, I’m stuck. I’ve a project I had been working on for school that involves HTML reports that, collectively, can be as large as 1.3 GB. Compressing each file with gzip reduces the collective size to 300 MB, while still allowing the files to be viewed in most modern web browsers (apparently, at the time of this writing, this does NOT include Apple’s Safari, which happens to be used by several of my professors, though Konqueror/KHTML works fine).

Note to self: port mod_gunzip to Apache 2.

Installing Java 2 on Debian, The Debian Way

I can never remember how to install Java on Debian, so here’s my version on how to do it the Debian Way (TM).

Download the Sun Java 2 Runtime environment or Development Kit from Sun’s Java site. The file you download should have a “.bin” extension. Then install:

sudo apt-get install java-package fakeroot

java-package is a set of Debian scripts for creating your own Debian-ized Java package. fakeroot lets you run programs, such as the Debian package creation process, in an environment that simulates root privileges without actually being root. After these are installed, run:

fakeroot make-jpkg jdk-*.bin
sudo dpkg -i sun-j2sdk*.deb

The first creates a Debian package from the Sun binary installer, while the second installs the created Debian package.

This will fulfill all Java dependencies in Debian, something you would not get if you installed Java via some other method. It’s also the “official” Java, as opposed to something like Blackdown, and makes you less reliant on other people for packaging. For example, I used this to create my own 64-bit AMD64 Java package.
