Reinstalling Windows

Wednesday, April 29, 2009

I’ve just completed a fun session of reinstalling Windows XP onto my home machine. I’m a big fan of portable apps that require no installation, and I carry my data and many of my apps with me, which makes reinstallation much easier. I made a brief list of the installed apps on the machine and then (after a backup!) ran the install CD for XP. This reinstalls onto the active partition, overwriting the Windows directory and reinstalling into “Program Files” (without deleting any files from there). Everything else on the machine remained untouched. I then reinstalled the programs I regularly use and everything was back up and running relatively quickly.

Except you should never expect anything to be quite so simple, and so it turned out. The XP disc had Service Pack 2 on it, so a 350MB download later got me Service Pack 3, which takes a little while to install. I then put on ArcGIS 9.3 (Desktop and Workstation), which takes quite some time. Add on its Service Pack 1 and several hours have passed. I then ran ArcGIS only to end up in endless dialogs configuring the software. At this point I decided to go back to a system restore point, which happened to be from before SP3 was installed, so I had to go through the whole rigmarole again before I was back at the same point! Some Googling suggested that a Microsoft XML DLL which ArcGIS uses was the problem; in fact one of the two files (msxml4.dll and msxml4r.dll) was present but in the wrong directory, and the other was missing entirely (so I copied the first and renamed it). Perhaps I’ve been lucky in that this is the first install problem I’ve had, but it does show how frustrating it can be for end users.
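
For anyone hitting the same problem, the fix boiled down to putting the two DLLs where ArcGIS expects to find them. A minimal sketch of the idea in Python (the paths here are purely illustrative; check where the files actually live on your system before moving anything):

```python
# Illustrative only: restore msxml4.dll / msxml4r.dll to the expected directory.
import os
import shutil

system32 = r"C:\WINDOWS\system32"                         # where ArcGIS looked for them (assumed)
misplaced = r"C:\Program Files\Common Files\msxml4.dll"   # hypothetical wrong location

# Copy the misplaced DLL into System32...
shutil.copy(misplaced, os.path.join(system32, "msxml4.dll"))

# ...and, as described above, create the missing file by copying and renaming.
shutil.copy(os.path.join(system32, "msxml4.dll"),
            os.path.join(system32, "msxml4r.dll"))
```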

Future of the OS?

Monday, April 27, 2009

Well, there was much anticipation (e.g. GIScussions) about the budget and its implications for the operation of the OS, particularly after all the reports that have been published (Power of Information, UK Location Strategy, Models of Public Sector Information Provision via Trading Funds). To say the outcome was disappointing for many is perhaps an understatement. Some initial reactions from the usual places: Steven Feldman, Ed Parsons, Michael Cross and Charles Arthur. And of course we do have the OS’s own WordPress blog outlining their new business strategy.

Charles puts it rather nicely: “Better to bail out banks with tens of billions that you might never get back than spend a few millions stimulating commercial enterprises and encouraging entrepreneurship by giving people access to essential, business-valuable data.” So possibly business as usual; the interesting aspect will be the effect (if any?) on the complexity of licensing and how this affects educational and non-commercial uses. Time will tell.

More Python Training

Saturday, April 25, 2009

After my Python taster at ESRI last November I was left wanting more. It was only a 2-day course and in that time it had to introduce Python and focus upon the framework that integrates with ArcGIS. Not surprisingly, many areas were left untouched, and ESRI training “days” are, well, not quite 9-5. So I booked a 3-day course at the Linux Emporium, a specialist outfit in Sutton Coldfield that focuses on Linux and Python. At £670 if you go to their base, it was incredibly reasonable.

The course is an introduction for programmers and introduces Python as a language and its heritage in terms of how it sits with other languages. It is mostly quite elegant, easy to pick up, cross-platform and very powerful (object-oriented, functional, dynamic, modular). It is interpreted (i.e. it is not compiled), which generally makes for easier development but slower run-time. It has matured over the last 18 years, has a large community base, and there are many extensions that take it into a variety of areas, as well as a huge resource of user-contributed modules. PyQt covers graphical interfaces and Django covers web server integration (and Google is one of the biggest users). Python is written in C but comes in other flavours: Jython (Java) and PyPy (Python). There is also a Windows version (IronPython) which integrates with the .NET framework. From a GIS perspective, both ArcGIS and QGIS use Python for scripting, but don’t be fooled into thinking that it’s just there to iterate processes for you. It can do an awful lot more.
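
To give a flavour of the ArcGIS side, here is a sketch of the sort of batch task that scripting makes painless, assuming ArcGIS 9.3 and its arcgisscripting module; the workspace path and shapefile names are made up:

```python
# Sketch of ArcGIS 9.3 geoprocessing from Python (paths and layer names are hypothetical).
import arcgisscripting

gp = arcgisscripting.create(9.3)      # create the geoprocessor object
gp.workspace = r"C:\data\project"     # hypothetical workspace

# Buffer a rivers layer by 100 m and clip the result to a study area --
# two toolbox operations chained together without any clicking.
gp.Buffer_analysis("rivers.shp", "rivers_buf100.shp", "100 Meters")
gp.Clip_analysis("rivers_buf100.shp", "study_area.shp", "rivers_clip.shp")
```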

In terms of the course itself, considerable time is spent introducing numbers, tuples, strings, lists and dictionaries and how you manipulate them. These form the basis for all work you do in Python, so understanding them (and running through the examples) is important. Later sessions deal with control flow, file operations and functions, and then move on to functional programming, modules and object orientation. The course fee includes support for 12 months and an extra “refresher” day taken within a year. I left knowing a lot more about Python, but also that it is a language worth investing in (for many generic data processing tasks). Highly recommended for all those that need to dip into scripting.
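
By way of illustration, a few lines covering the ground the course spends most time on (nothing from the course materials themselves, just the standard types, control flow and a function; the site names and readings are made up):

```python
# Core Python building blocks: lists, dictionaries, control flow and functions.

def mean(values):
    """Return the arithmetic mean of a sequence of numbers."""
    return sum(values) / float(len(values))

# A dictionary mapping hypothetical field sites to lists of readings.
readings = {
    "site_a": [0.12, 0.15, 0.11],
    "site_b": [0.34, 0.31, 0.36],
}

# Loop over the dictionary and summarise each site.
for site in sorted(readings):
    print "%s: mean reading %.3f from %d samples" % (
        site, mean(readings[site]), len(readings[site]))
```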

Palm Pre Update

Monday, April 20, 2009

There’s been a slow trickle of PR concerning the Palm Pre and it increasingly looks like a May release date on Sprint. In general there seems to have been a lot of positive commentary on both the Pre and webOS. Developers have been brought on board early and the ease and speed of porting software across seems to have won them over, at least initially. We also have MotionApps announcing a PalmOS emulator for the Pre with some pretty firmed-up details on what it can and can’t do. Of course there are still naysayers, and not without reason. This is a hugely competitive marketplace in a downward-spiralling world economy; Apple, Nokia, RIM and Google (Android) all have very big slices of this pie. However, Palm have a heritage of innovation and currently employ a very talented design team. Time will tell!

Two big problems remain, though. Firstly, there has been no mention of a GSM version, and whilst one surely must be coming, who knows when it will be released. Secondly, looking over the specs again, I note that there is no option to expand the storage on the device: it’s 8GB and only 8GB. That increasingly looks like not very much once photos and music are involved. My Palm TX uses the excellent Palm Powerups PowerSDHC driver, allowing me to carry my entire MP3 collection on a 16GB SDHC card (although that’s getting a little small now). I struggle to understand the rationale behind Palm’s decision here.

Academic programming and research

Saturday, April 18, 2009

Paul Mather wrote an interesting editorial in the latest RSPSoc newsletter (2009, 32, 2) detailing recent comments proposing that photogrammetry is dead as a subject. The basis for this was that well-funded computer vision researchers have taken photogrammetric principles and are now driving research forward in this area. It raises the wider issue of whether researchers need programming skills to pursue their work, and Paul notes that “you cannot do real research if your research questions are limited to what a commercial software package will let you do.” He contends that commercial software only allows researchers to use “last year’s techniques.”

Now I don’t disagree that the capabilities of a package should not determine the work that you do, and that you inevitably need some kind of programming or scripting skill to do at least some bespoke work. Indeed, when you need to move into more complex processing, more generalised environments such as MATLAB or IDL are often used. That said, much environmental research requires fairly straightforward analysis; just because a technique is the latest or newest doesn’t mean it’s the best or most appropriate.

Where I have a much bigger bone of contention is the quality of the programming. A number of years ago I performed a PCA (in Imagine) on a number of images. The result was interesting, but I had to revisit the dataset about a year later and re-do the analysis (using a newer version of Imagine). Much to my surprise, the results were different. This raised the horrible prospect of algorithm modifications producing different results, and it is perhaps the biggest problem with commercial software: it is a black box and you have no real way of knowing how good the results actually are. Microsoft Excel has long been hammered for producing inconsistent or incorrect results, yet it is routinely used for much academic statistical work (and heck, you can even play on the flight simulator!). SPSS is generally much better regarded, but again its algorithms are largely unknown; empirical testing is required to ascertain the quality of the output. Many of the routines in ArcGIS Workstation are better, being well documented and generally taken from academic research (although many remain 20 years old; fine if they do the job, but not so good if there are much better alternatives). An excellent example of how work should be progressed is the R Project. Here we have open source statistical software with routines written in a generalised scripting environment. Routines are often submitted to statistical journals by researchers in the area and peer reviewed. That doesn’t make them faultless, but it gives you a far better chance of producing work with correct results.
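
This is one place where a scripting environment helps: if the principal components are computed in a few explicit lines, you can see, archive and re-run exactly what was done. A minimal sketch of a PCA with Python and NumPy (the band values here are random stand-ins; in practice they would be read from the imagery):

```python
# PCA "in the open": every step is visible and repeatable, unlike a black-box menu option.
import numpy as np

# Hypothetical data: 4 bands by 10,000 pixels.
bands = np.random.rand(4, 10000)

# 1. Centre each band on its mean.
centred = bands - bands.mean(axis=1)[:, np.newaxis]

# 2. Band-to-band covariance matrix and its eigen-decomposition.
cov = np.cov(centred)
eigvals, eigvecs = np.linalg.eigh(cov)

# 3. Order components by decreasing variance and project the pixels onto them.
order = np.argsort(eigvals)[::-1]
components = np.dot(eigvecs[:, order].T, centred)

print "Proportion of variance explained:", eigvals[order] / eigvals.sum()
```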

Paul proposes the development of relevant code libraries for remote sensing, partly because so much C and C++ code is now open source. Whilst this sounds good in principle, much software has suffered from the “bolt-on” approach where you take what you have and simply add to it. Blackboard is an appalling bolt-on product, along with the behemoth that Netscape Navigator became. Just because code is open source doesn’t mean it is any good or, indeed, correct. And developing graphical environments is much more time-consuming. But as researchers we do want to develop code to analyse or process data, so what is the best solution? Well, there probably isn’t one, but the issues to consider include speed of development, availability of existing code, cost, code speed, quality of algorithms, archivability and long-term usability.

A nice example of considering these issues comes from NERC FSF. They supply some Excel 2003 templates to process field spectra, and they work very well. Of course, with the release of Excel 2007 they stopped working and required re-development. They are also proprietary and there is an inherent cost in terms of Excel licensing. In stark contrast, the LaTeX base system was frozen in (I think) 1989. The software has been improved through packages, but what it means is that typesetting code written in 1989 will work in 2009. That is a fantastic achievement, particularly in comparison to Excel VBA scripts, where two years is about all you’re guaranteed. FSF have updated their templates but are also working on some MATLAB scripts, which seems a better solution. R strikes me as a good environment because of the peer review and generalised scripting, but it doesn’t yet offer the richness of image processing that MATLAB or, in particular, IDL does. I’m far less familiar with either MATLAB or IDL, but do they offer the best option for an image processing code library? Or should we be pushing for an image processing environment along the lines of R? I know there are a variety of open source projects running, but I’m not aware of their status in this respect. Such a code library would of course be an excellent resource and would save research projects from reprogramming the same thing over and over.