Saturday, 1 May 2010

Steve Jobs on Flash

Steve Jobs has posted his thoughts on Adobe's Flash and why Apple have not allowed Flash to be installed on iPhones and iPads.

Adobe have, probably accidentally, developed something like HTML5 years (perhaps 8) ahead of the W3C. They saw the need for a standard, OS-agnostic platform for the development of applications and the presentation of content including video, audio, animation and interactivity.

Adobe's ActionScript was also fast. It seems that the browser makers felt there was no need to work on JavaScript speed because CPUs were getting faster each year as a consequence of Moore's Law.

But something happened early this decade: the growth in CPU speed (clock rates) started to slow and, to compensate, manufacturers began to introduce multi-core processors.

Web pages, however, seem to be hard to render using multiple threads and so JavaScript performance began to stagnate.

Enter Google. They believe in open standards, an open web and everything running in the browser, sourced from the internet. To make this possible, JavaScript needed to be fast, so they started the Chrome browser project, which incorporated a new and fast V8 JavaScript engine. Shortly afterwards, it seemed, WebKit (Apple) and Mozilla began to pick up their JavaScript performance as well. And now we see Microsoft is also working on JavaScript performance and standards compliance for IE9.

Adobe have had a good run, but standardisation has caught them up. (In a similar way, standardisation caught Lotus Notes which, for the time it was developed, was - or appeared to be - visionary: Tabbed workspace, forms, separation of data from presentation, security, encryption, signed applications...).

Back to Steve Jobs's posting. Steve thinks that Flash is closed and that Apple is the exact opposite, meaning open.

Open

First, there’s “Open”.
Adobe’s Flash products are 100% proprietary. They are only available from Adobe, and Adobe has sole authority as to their future enhancement, pricing, etc. While Adobe’s Flash products are widely available, this does not mean they are open, since they are controlled entirely by Adobe and available only from Adobe. By almost any definition, Flash is a closed system.
Apple has many proprietary products too. Though the operating system for the iPhone, iPod and iPad is proprietary, we strongly believe that all standards pertaining to the web should be open. Rather than use Flash, Apple has adopted HTML5, CSS and JavaScript – all open standards. Apple’s mobile devices all ship with high performance, low power implementations of these open standards. HTML5, the new web standard that has been adopted by Apple, Google and many others, lets web developers create advanced graphics, typography, animations and transitions without relying on third party browser plug-ins (like Flash). HTML5 is completely open and controlled by a standards committee, of which Apple is a member.
Apple even creates open standards for the web. For example, Apple began with a small open source project and created WebKit, a complete open-source HTML5 rendering engine that is the heart of the Safari web browser used in all our products. WebKit has been widely adopted. Google uses it for Android’s browser, Palm uses it, Nokia uses it, and RIM (Blackberry) has announced they will use it too. Almost every smartphone web browser other than Microsoft’s uses WebKit. By making its WebKit technology open, Apple has set the standard for mobile web browsers.



What if we take what Steve wrote and swap 'Apple' with 'Adobe' and 'Flash' with 'Mac OS X'? This is what we get:



Apple’s Mac OS X products are 100% proprietary. They are only available from Apple, and Apple has sole authority as to their future enhancement, pricing, etc. While Apple’s Mac OS X products are widely available, this does not mean they are open, since they are controlled entirely by Apple and available only from Apple. By almost any definition, Mac OS X is a closed system. 



It isn't perfect, but it is very close to the truth. Apple use open source, contribute to and develop with open source software, but they produce very proprietary software. You cannot run OS X on any hardware other than Apple hardware. To write well-integrated OS X applications, you need to use Apple's proprietary interfaces - Carbon or Cocoa. These applications will not run on other OSs (Windows or Linux), so some developers choose frameworks that let them write applications that will run on any OS platform - or they use Java. Steve doesn't like this.

And now, you have to use Apple's APIs directly to write applications for the iPhone and iPad. This has upset Adobe (and probably a number of other organisations that make cross-platform frameworks such as XMLVM).

I think Steve accepts an open web, but everything else should be closed, and Apple is certainly insisting on this path. The iTunes store can only really be used with iTunes and iTunes can only be used with iPods and iPhones and now iPads.

I wonder when iTunes will stop supporting Windows?

iPhoto, iDVD and iMovie only work on OS X too. Keeping your photos on a Mac using Apple's software does tend to lock you in to using Apple hardware and software for a long time.

Full Web

Second, there’s the “full web”.
Adobe has repeatedly said that Apple mobile devices cannot access “the full web” because 75% of video on the web is in Flash. What they don’t say is that almost all this video is also available in a more modern format, H.264, and viewable on iPhones, iPods and iPads. YouTube, with an estimated 40% of the web’s video, shines in an app bundled on all Apple mobile devices, with the iPad offering perhaps the best YouTube discovery and viewing experience ever. Add to this video from Vimeo, Netflix, Facebook, ABC, CBS, CNN, MSNBC, Fox News, ESPN, NPR, Time, The New York Times, The Wall Street Journal, Sports Illustrated, People, National Geographic, and many, many others. iPhone, iPod and iPad users aren’t missing much video.
Another Adobe claim is that Apple devices cannot play Flash games. This is true. Fortunately, there are over 50,000 games and entertainment titles on the App Store, and many of them are free. There are more games and entertainment titles available for iPhone, iPod and iPad than for any other platform in the world.

Adobe claim that by not having Flash, iPhone users are missing out on the full web experience. It is true that Flash content cannot be displayed on the iPhone/iPad, but I think most web sites will develop special versions of their sites specifically for iPhone/Android devices that have limited screen sizes and limited user input interfaces: mice and touch pads offer very fine pointing and clicking controls whereas fingers are a little less accurate and cover up what you are touching. On-screen keyboards are great but they are no match for a reasonably large physical keyboard.

In time, keyboards may well disappear, but they will be replaced with something that works as well as the real thing, not something that slows you down.

So iPhone and Android users alike are already missing some of the full web experience, but they have the advantage of mobility and newer customised web sites that will only make the experience better.

The existence or lack of H.264 is not really an issue. Flash now supports H.264 and any video will be in a format that iPhone users will be able to view. Interestingly, H.264 is proprietary (and Apple has some interest in the patents associated with it), so Steve is not pushing for an open web experience here - he wants royalties and refuses to add the open source Ogg/Theora audio and video formats to Safari to help make the web truly open and free.

Security

Third, there’s reliability, security and performance.
Symantec recently highlighted Flash for having one of the worst security records in 2009. We also know first hand that Flash is the number one reason Macs crash. We have been working with Adobe to fix these problems, but they have persisted for several years now. We don’t want to reduce the reliability and security of our iPhones, iPods and iPads by adding Flash.
In addition, Flash has not performed well on mobile devices. We have routinely asked Adobe to show us Flash performing well on a mobile device, any mobile device, for a few years now. We have never seen it. Adobe publicly said that Flash would ship on a smartphone in early 2009, then the second half of 2009, then the first half of 2010, and now they say the second half of 2010. We think it will eventually ship, but we’re glad we didn’t hold our breath. Who knows how it will perform?

Steve is also worried about Adobe's Flash reliability and security. He has the statistics and claims that Flash is the number one cause of Mac crashes. I wonder what the number two cause is?

So, he says that it is best to keep Flash away from the iPhone and iPad.

Google has taken a different, seemingly more rational approach. They have decided to include Flash in Chrome and have made plans to address reliability and security issues.

Google seeks to eliminate problems rather than add layers to reduce risk. Their Native Client does just this: an architecture to allow any plugin to run so long as it can be validated that it complies with hard rules that prevent software from doing anything harmful. This has to be better than validated compiler tool chains, signed applications, layers of malware filtering and heuristic code analysis.

Battery Life

Fourth, there’s battery life.
To achieve long battery life when playing video, mobile devices must decode the video in hardware; decoding it in software uses too much power. Many of the chips used in modern mobile devices contain a decoder called H.264 – an industry standard that is used in every Blu-ray DVD player and has been adopted by Apple, Google (YouTube), Vimeo, Netflix and many other companies.
Although Flash has recently added support for H.264, the video on almost all Flash websites currently requires an older generation decoder that is not implemented in mobile chips and must be run in software. The difference is striking: on an iPhone, for example, H.264 videos play for up to 10 hours, while videos decoded in software play for less than 5 hours before the battery is fully drained.
When websites re-encode their videos using H.264, they can offer them without using Flash at all. They play perfectly in browsers like Apple’s Safari and Google’s Chrome without any plugins whatsoever, and look great on iPhones, iPods and iPads.

Steve ignores other Flash applications here (which may or may not be kind to battery life) and focuses on H.264 video playback. Again, if Flash now supports H.264 then this is a non-issue (except that web sites would need to re-encode their content which they have to do for the iPhone/iPad anyway).

Touch


Fifth, there’s Touch.
Flash was designed for PCs using mice, not for touch screens using fingers. For example, many Flash websites rely on “rollovers”, which pop up menus or other elements when the mouse arrow hovers over a specific spot. Apple’s revolutionary multi-touch interface doesn’t use a mouse, and there is no concept of a rollover. Most Flash websites will need to be rewritten to support touch-based devices. If developers need to rewrite their Flash websites, why not use modern technologies like HTML5, CSS and JavaScript?
Even if iPhones, iPods and iPads ran Flash, it would not solve the problem that most Flash websites need to be rewritten to support touch-based devices.


Steve says that touch interfaces don't work the same as mice/touchpads and therefore Flash applications won't work anyway. Interestingly, Flash started out as a drawing application for the PenPoint OS, which may have had similar behaviour to a touch interface, but I don't know.

What Steve fails to mention is that many web sites also make use of mouseover events to show text and graphics as your mouse pointer hovers over a particular word, link or image. Blogger uses tooltips which help a little in explaining the function of a button. Even Apple's web store for the iPhone uses 'rollovers' to display the help, account and cart menus! I guess Apple had to re-write these sites for the iPhone.

Actually, I just checked - the store is virtually unusable on an iPod touch. You can click on the help menu and a menu will be displayed so you can then double-touch to zoom in. Wouldn't this work for Flash too?

All web sites that need mouseover events to operate will have to be re-written for the iPhone so banning Flash does not fix this - the web site owner needs to do some work to make their sites more accessible for iPhone and iPad users. So why is this an issue Steve? This looks like hypocrisy to me.

The 'real' reason

Sixth, the most important reason.
Besides the fact that Flash is closed and proprietary, has major technical drawbacks, and doesn’t support touch based devices, there is an even more important reason we do not allow Flash on iPhones, iPods and iPads. We have discussed the downsides of using Flash to play video and interactive content from websites, but Adobe also wants developers to adopt Flash to create apps that run on our mobile devices.
We know from painful experience that letting a third party layer of software come between the platform and the developer ultimately results in sub-standard apps and hinders the enhancement and progress of the platform. If developers grow dependent on third party development libraries and tools, they can only take advantage of platform enhancements if and when the third party chooses to adopt the new features. We cannot be at the mercy of a third party deciding if and when they will make our enhancements available to our developers.
This becomes even worse if the third party is supplying a cross platform development tool. The third party may not adopt enhancements from one platform unless they are available on all of their supported platforms. Hence developers only have access to the lowest common denominator set of features. Again, we cannot accept an outcome where developers are blocked from using our innovations and enhancements because they are not available on our competitor’s platforms.
Flash is a cross platform development tool. It is not Adobe’s goal to help developers write the best iPhone, iPod and iPad apps. It is their goal to help developers write cross platform apps. And Adobe has been painfully slow to adopt enhancements to Apple’s platforms. For example, although Mac OS X has been shipping for almost 10 years now, Adobe just adopted it fully (Cocoa) two weeks ago when they shipped CS5. Adobe was the last major third party developer to fully adopt Mac OS X.
Our motivation is simple – we want to provide the most advanced and innovative platform to our developers, and we want them to stand directly on the shoulders of this platform and create the best apps the world has ever seen. We want to continually enhance the platform so developers can create even more amazing, powerful, fun and useful applications. Everyone wins – we sell more devices because we have the best apps, developers reach a wider and wider audience and customer base, and users are continually delighted by the best and broadest selection of apps on any platform.

Steve simply wants to make it hard for any developer to write applications for multiple platforms.

It may be true that frameworks limit the features, but equally it may be true that the Apple iPhone/iPad OS is the lowest common denominator - why do you think that Apple will always have more features than your competitors? Can you merge directories of the same name in OS X Finder yet?

If this is a real reason, then why not specify that any framework must support the whole API? This surely would address your concerns about having all the features available to the developer.

Conclusion


I am not a fan of Flash, but it is generally required for YouTube for PC and laptop users.

I agree that HTML5 is the future but I disagree that it should only include a patented and proprietary H.264.

Apple could allow Flash on the iPhone since web sites have to be re-written anyway for iPhone users.

Battery life while playing videos may not be an issue if Flash on the iPhone/iPad used H.264 and Flash had access to Apple's H.264 API.

Security is solvable and Google and Adobe seem to be about to demonstrate this.

Postscript

The person who sent me the link to Steve's posting owns a Mac and an iPod, that much I know for certain. They are going to say 'bye bye' to Apple based on Steve's compelling argument against proprietary software:
Jobs makes a compelling case for not trusting a proprietary company who has sole control over their proprietary products.
So, bye bye Apple.

I have 3 MacBook Pros, an iPod Nano, an iPod Touch and a Time Capsule, and have been influential in at least the purchase of a Mac Mini over the last 5 years (about $15,000 worth at time of purchase). I am re-considering my use of Apple software and hardware and I will certainly not purchase anywhere near the amount of Apple products in the next 5 years - if any.

I will also be removing the shackles of proprietary Apple software by moving my photo and music collections to Open Source software and online services such as Google Docs. Not just because of this open letter about Flash, but because Apple is removing people's freedom to develop and use software the way they choose to.

Saturday, 17 April 2010

Simple Rolling Hash- Part 3


The Rolling Hash

Now that there is a way to remove the first character and append a new last character (in constant time) to a string hash value, we can also use this hash function suite to perform substring searches in linear time.

To find a substring of length m in a string of length n you might proceed as follows:
Start at the first character of the string
Compare the substring to the string for m characters.
If same, then the substring is found.
else, repeat from the next character of the string (stop when there are fewer than m characters left in the string)

In practice this works well since it is unlikely that more than a few characters will match each round, so the effective running time is much better than O(N x M). But for some strings, such as DNA, which have small alphabets, the number of matches in each round could be large.

By using a rolling hash this can be reduced to linear-time.

Calculate the hash of the substring we are searching for.
Start with a hash of the first m characters of the string
If the hash values match then the substring is found - probably.
else remove the first character of the string from the hash and add the m+1 th character to the hash.
Now the hash is of m characters from the next character of the string, so repeat the comparison from there.

This process is O(N).

The catch is that a matching hash only means the substring has been found with high probability. To be sure that it is a match we still need to compare the characters, which costs an extra O(M) for each hash match.
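The steps above can be sketched in Python. This is only an illustration under stated assumptions: the multiplier k = 17 and the 28-bit hash (arithmetic modulo 2**28) are the values used elsewhere in this series, the function name find_substring is made up for the example, and characters are taken as their code points.

```python
K = 17          # assumed multiplier
MOD = 2 ** 28   # assumed 28-bit hash: all arithmetic is modulo 2**28


def find_substring(text: str, pattern: str) -> int:
    """Return the index of the first match of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    if m > n:
        return -1

    top = pow(K, m - 1, MOD)   # weight of the leading character in a window
    hp = ht = 0
    for i in range(m):         # hash the pattern and the first window
        hp = (K * hp + ord(pattern[i])) % MOD
        ht = (K * ht + ord(text[i])) % MOD

    for start in range(n - m + 1):
        # A hash match is only "probably" a match, so confirm it.
        if ht == hp and text[start:start + m] == pattern:
            return start
        if start + m < n:
            # Roll: drop text[start], shift by K, append text[start + m].
            ht = ((ht - top * ord(text[start])) * K + ord(text[start + m])) % MOD
    return -1
```

Each window costs a constant amount of work unless the hashes agree, in which case the characters are compared as well.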

A Good Hash Function

My next question was 'is this hash good enough?'.

After some research I found that hashing is similar to a problem called Balls and Bins. The idea is to analyse mathematically the result of randomly throwing m balls into n bins.

It turns out that if a hash function is 'good' then it should produce results similar to randomly throwing balls (strings) into bins (string hash numbers).

I am no mathematician so this was slow going and a bit frustrating since I don't know all the tricks used to simplify problems.

I found that for a good hash function with n possible hash numbers, the number of hash numbers still unused after hashing n random-like strings should be about n/e. This is easy to test: I made an array of size n, generated random strings, hashed them, and incremented the element of the array indexed by the hash number.

(To keep the array size small, I first reduced the 28 bit hash number to a 16 bit number by mod 65536).

The number of hash numbers used exactly once is also about n/e. Likewise, it is easy to calculate the probability of a given hash number being used exactly k times:

Pr[ k ] = 1 / ( e * k! )

And the probability that a hash number is used k or more times is roughly

Pr[ >=k ] ~ ((e/k)**k)/e
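A rough way to see these numbers is to simulate the ideal case directly. The sketch below does not use my hash at all: it throws n balls into n bins with Python's random module (an assumption standing in for a 'perfect' hash) and compares the fraction of bins used exactly k times with 1 / ( e * k! ). The names tally, fraction_used and expected_fraction are made up for the example.

```python
import math
import random


def tally(bin_indices, n_bins):
    """Count how many times each bin (hash number) is hit."""
    counts = [0] * n_bins
    for i in bin_indices:
        counts[i] += 1
    return counts


def fraction_used(counts, k):
    """Fraction of bins that were hit exactly k times."""
    return sum(1 for c in counts if c == k) / len(counts)


def expected_fraction(k):
    """Ideal value when n balls are thrown into n bins: 1 / (e * k!)."""
    return 1 / (math.e * math.factorial(k))


N = 65536                   # same size as the 16-bit reduction
rng = random.Random(2010)   # fixed seed so the run is repeatable
counts = tally((rng.randrange(N) for _ in range(N)), N)

for k in range(4):
    print(k, round(fraction_used(counts, k), 4), round(expected_fraction(k), 4))
```

To test a real hash function the same way, feed tally with the reduced hash numbers of the random strings instead of the random indices.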

(to be completed)

Friday, 16 April 2010

Simple Rolling Hash- Part 2


String Slicing

I also wanted to extract substrings of appended or fixed-length strings quickly as well. 

Since most strings would be less than 64 characters (I reasoned), this too would be virtually constant-time. 

In the case of appended strings, the slicing time would be proportional to the number of appended strings, which is roughly O(N) but with a very small constant factor, in the order of 1/10 th or smaller on average.

Again, to ensure that the hash calculation would not harm performance, I needed a way to adjust the hash of a string by the hash of the leading or trailing substrings.

eg. 

hash( "def" ) = some_function( hash ( "abcdef" ) , hash( "abc" ) )
hash( "abc" ) = some_function( hash ( "abcdef" ) , hash( "def" ) )
To do this I needed to choose the value of k carefully so that k has a multiplicative inverse, inv(k), under the 28-bit hash arithmetic: k * inv(k) = 2**28 + 1, which is 1 modulo 2**28 - ie. k and inv(k) are factors of 2**28 + 1.

Lucky Break

This part is simply a fluke. I needed my hash to be 28 bits so that I can store type information in the other 4 bits of a 32-bit value. 

I could use a hash that is less than 28 bits, but not more. Fortunately, 2**28 + 1, which is 268,435,457, has two prime factors: 17 and 15790321.

It is also fortunate that 17 is a reasonable multiplier for a hash as well since, I reasoned, it preserves the low 4 bits of each character which contain the most information for the letters and numbers and punctuation characters.

I have found another hash function that uses k=17.
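The arithmetic behind this fluke is easy to check. The short Python sketch below assumes the hash is kept to 28 bits by working modulo 2**28; the constant names are made up for the example.

```python
K = 17
INV_K = 15790321
MOD = 2 ** 28   # assumed: 28-bit hash values, arithmetic modulo 2**28

# The two factors multiply to 2**28 + 1 ...
assert K * INV_K == 2 ** 28 + 1 == 268_435_457
# ... which is 1 modulo 2**28, so INV_K undoes a multiplication by K.
assert (K * INV_K) % MOD == 1

h = 123_456
shifted = (h * K) % MOD              # "multiply by k"
restored = (shifted * INV_K) % MOD   # "divide by k"
print(restored == h)
```

In Python 3.8 and later, pow(17, -1, 2**28) computes the same inverse.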


Removing Characters and Calculating the New Hash in O(1)

To remove a leading character from a string of length n, you need to remove its component of the hash which has been multiplied by k, n-1 times. So the new hash is

hash( s' ) = hash( s ) - k**(n-1) * c, where c is the first character of the string s

This can be done in constant-time, but removing M characters one at a time takes linear-time, O(M).

In the more general case, removing a leading substring of m characters follows the same pattern: here the leading substring has been multiplied by k, n - m times.

hash( s' ) = hash( s ) - k**(n-m) * hash( f ), where f represents the leading m characters of the string s.

To remove the trailing character from a string of length n, you need to subtract it from the hash and then divide the hash of the string by k. Division can be performed by multiplying by the inverse of k which is 15790321.

hash( s' ) = ( hash( s ) - c ) * inv( k ), where c is the last character of the string s.

To remove the trailing m characters from a string of length n, you need to subtract the hash of the m characters and then divide the hash of the string by k**m. Again, division can be performed by multiplying by the inverse of k which is 15790321.

hash( s' ) = ( hash( s ) - hash( t ) ) * inv( k )**m, where t is the last m characters of the string s.

Unfortunately in the general case, to remove substrings from a string, I need the hash of each substring which takes linear-time.
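The removal formulas can be tried out with a few lines of Python. This is a sketch under the same assumptions as the rest of the series (k = 17, inv(k) = 15790321, arithmetic modulo 2**28); the function names are invented for the example, and pow is used for brevity where a small table of powers of k would do the same job.

```python
K = 17
INV_K = 15790321
MOD = 2 ** 28   # assumed 28-bit hash


def string_hash(s: str) -> int:
    """hash(n) = k * hash(n-1) + char(n), starting from 0."""
    h = 0
    for ch in s:
        h = (K * h + ord(ch)) % MOD
    return h


def remove_leading(h: int, h_f: int, n: int, m: int) -> int:
    """Hash of s without its leading m characters (whose hash is h_f)."""
    return (h - pow(K, n - m, MOD) * h_f) % MOD


def remove_trailing(h: int, h_t: int, m: int) -> int:
    """Hash of s without its trailing m characters (whose hash is h_t)."""
    return ((h - h_t) * pow(INV_K, m, MOD)) % MOD


h = string_hash("abcdef")
print(remove_leading(h, string_hash("abc"), 6, 3) == string_hash("def"))
print(remove_trailing(h, string_hash("def"), 3) == string_hash("abc"))
```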

Reducing Substring Hash Time

Here is where the fixed-length strings help: The only substrings that need to be calculated are the first and last fixed-length strings of a slice. And these substrings are between 1 and 63 characters long. 


eg. Assume the fixed-length strings are 3 characters long and slice a string to return 6 characters from the second.
( "abc" , "def" , "ghi" ) :2:6 -> ( "bc" , "def" , "g" )
In this case, the hash for "bc" and "g" have to be calculated, but the hash for "def" remains the same.

We can always guarantee minimal time by observing that if the substring to be removed is greater than half the length of the fixed-length string then it is better to re-calculate the hash of the resulting substring. This way, worst case is calculating the hash of fixed-length characters, or 64 in my case.
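Here is a rough Python sketch of the slicing idea, using the 3-character example from above. It is a simplification, not my actual implementation: the partial first and last pieces are always re-hashed rather than adjusted, whole pieces keep their stored hash, and the names slice_chunks and combined_hash are made up. It assumes k = 17 and a 28-bit hash.

```python
K = 17
MOD = 2 ** 28   # assumed 28-bit hash


def string_hash(s: str) -> int:
    h = 0
    for ch in s:
        h = (K * h + ord(ch)) % MOD
    return h


def slice_chunks(chunks, start, length):
    """chunks is a list of (text, hash) pairs; start is 1-based."""
    lo, hi = start - 1, start - 1 + length
    out = []
    pos = 0                                   # offset of the current chunk
    for text, h in chunks:
        a = max(lo - pos, 0)
        b = min(hi - pos, len(text))
        if a < b:
            if a == 0 and b == len(text):
                out.append((text, h))         # whole chunk: reuse its hash
            else:
                part = text[a:b]
                out.append((part, string_hash(part)))   # partial: re-hash
        pos += len(text)
    return out


def combined_hash(chunks):
    """Fold the chunk hashes together with the append formula."""
    h = 0
    for text, part_hash in chunks:
        h = (pow(K, len(text), MOD) * h + part_hash) % MOD
    return h


chunks = [(t, string_hash(t)) for t in ("abc", "def", "ghi")]
result = slice_chunks(chunks, 2, 6)
print([t for t, _ in result])
```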

If it happens that the hash of the substrings is known then we get back to constant-time.

Simple Rolling Hash- Part 1



To process strings quickly I decided to try some hashing techniques to quickly append strings and make sub-strings (or slices) in time O(1) - constant time.


I also wanted to be able to search a string for some usually smaller string or word in linear time. This technique is called a rolling hash.




Background

A 'hash' is a number that is generated from some data in such a way that the distribution of the numbers looks random and there is a low probability that the same number is re-used.

For example, a hash of a string could be to multiply the length of the string by the sum of all the characters (this probably isn't a good hash method, but it explains the concept).

eg.

hash(abc) = 3 * ( 97 + 98 + 99 ) = 882
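As a quick check, this toy hash can be written in a couple of lines of Python (the name toy_hash is made up for the example). For "abc" it works out to 3 * ( 97 + 98 + 99 ) = 882, and it also shows one reason the method is poor: any re-ordering of the same characters gives the same number.

```python
def toy_hash(s: str) -> int:
    """Length of the string multiplied by the sum of its character codes."""
    return len(s) * sum(ord(ch) for ch in s)


print(toy_hash("abc"))
print(toy_hash("cba") == toy_hash("abc"))   # re-ordered characters collide
```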

My Requirement - O(1) String Comparison

To speed-up string comparisons I was looking to use a 28-bit hash.

My aim was to have string comparison performed in roughly constant-time. To do this, I couldn't simply scan both strings looking for a mis-match as this would take O(N) time where N is the length of the two strings (unequal length strings can never match).

Instead I had to use some form of a hash of each string and only scan the strings if the hashes were equal. 
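In Python the idea looks something like the sketch below. It assumes each string carries a hash that was calculated once when the string was created; the hash shown (k = 17, 28 bits) is the one this series ends up with, and the name strings_equal is made up for the example.

```python
K = 17          # assumed multiplier
MOD = 2 ** 28   # assumed 28-bit hash


def string_hash(s: str) -> int:
    h = 0
    for ch in s:
        h = (K * h + ord(ch)) % MOD
    return h


def strings_equal(a: str, hash_a: int, b: str, hash_b: int) -> bool:
    """Compare stored hashes first; scan the strings only if they agree."""
    if len(a) != len(b) or hash_a != hash_b:
        return False    # decided without looking at the characters
    return a == b       # hashes agree: confirm with a full scan
```

Most unequal strings are rejected by the length or hash test, so the full scan is only paid for strings that are very likely to be equal anyway.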

I looked at the DJB and SDBM hash functions for their simplicity. SDBM seems the better one by all accounts, but both would do for my simple purpose.

My Strings

My strings are not simply an array of characters. 

I divide each string into non-mutable fixed-length strings - currently 64 x 8-bit characters or 16/21 Unicode characters.

(My aim is to pick a fixed-length size that is 'just right' most of the time. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.39.6999&rep=rep1&type=pdf )

This design, I was hoping, would reduce the cost of appending strings by allowing the new string to be formed as a linked list of the two source strings.

So, 

"abc" + "defg" -> ( "abc" , "defg" ) 

As a side benefit, this also allowed for a more straight-forward memory management and garbage collection scheme - if all memory allocations are the same size, then they can be easily and cheaply recycled, and memory is much less likely to become fragmented.

Although this would be fast - constant-time O(1) - I would have to re-calculate the hash for the new string which would result in linear-time O(N) behaviour and therefore negate any real performance gains: appending strings would still be O(N).

What I needed was a hash that could be generated from the individual hashes of the two source strings. 

With a little modulo arithmetic I was able to determine that if the hash functions were of the form:

hash(n) = k * hash(n-1) + char(n), and hash(0) = 0

then the resulting string would have a hash of

hash( s1+s2 ) = k ** length( s2 ) * hash( s1 ) + hash( s2 )

So now, I could append two strings and calculate the hash of the new string in constant-time.
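A small Python sketch shows the append formula at work. The multiplier k = 17 and the 28-bit modulus are assumptions at this point in the series, and the function names are made up. pow is used for brevity; with fixed-length strings of at most 64 characters, a small table of powers of k would turn it into a simple lookup.

```python
K = 17          # assumed multiplier
MOD = 2 ** 28   # assumed 28-bit hash


def string_hash(s: str) -> int:
    """hash(n) = k * hash(n-1) + char(n), with hash(0) = 0."""
    h = 0
    for ch in s:
        h = (K * h + ord(ch)) % MOD
    return h


def append_hash(h1: int, h2: int, len2: int) -> int:
    """hash(s1 + s2) from hash(s1), hash(s2) and length(s2)."""
    return (pow(K, len2, MOD) * h1 + h2) % MOD


h = append_hash(string_hash("abc"), string_hash("defg"), len("defg"))
print(h == string_hash("abcdefg"))
```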

Continued in Part 2.

Saturday, 10 April 2010

Apple's new iPhone license agreement

ars technica covers it well


http://arstechnica.com/apple/news/2010/04/apple-takes-aim-at-adobe-or-android.ars?utm_source=rss&utm_medium=rss&utm_campaign=rss


Apple's new API license includes the following:



The new version of 3.3.1 reads:
3.3.1 — Applications may only use Documented APIs in the manner prescribed by Apple and must not use or call any private APIs. Applications must be originally written in Objective-C, C, C++, or JavaScript as executed by the iPhone OS WebKit engine, and only code written in C, C++, and Objective-C may compile and directly link against the Documented APIs (e.g., Applications that link to Documented APIs through an intermediary translation or compatibility layer or tool are prohibited).
This seems to mean that developers cannot use tools that translate programs from one platform to the iPhone, or frameworks that try to make all smart-phones look alike.

This is clearly a closed, dictatorial stance.

There are already many iPhone apps and games that are openly built this way - what do they do?


But how can they test that a code-generator/cross compiler was used?

Perhaps this will spawn a new breed of code-generators that actually translate code from one language to another with same variable names, comments, parameters, etc.

Now that would be cool.

Monday, 29 March 2010

Solar Panel Update

Yesterday was the first anniversary of our photo-voltaic solar panel installation.



Generated

We have generated 1951 kWh of electrical energy.

This works out to be 5.34 kWh per day on average. A lot less than the 7.5 kWh per day that we were told to expect. At 7.5 kWh we should have generated 2737 kWh over the year.

They base this on Solar Irradiation Maps such as these. Sydney, for example, should receive 5.5 equivalent peak sun hours (PSH). Our 9 panels have an effective area of about 10 square metres. At 15% efficiency, we should generate 1.5 kW during full sun. At 5.5 PSH we should be getting 8.25 kWh per day on average. This roughly agrees with the company's estimate of 7.5 kWh per day. Perhaps something is wrong.

This variation is disappointing. It works out to be 71% of what should be possible. Now we have had a very wet summer so that would limit our generation capacity.
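For anyone who wants to re-run the numbers, here is the arithmetic as a short Python sketch. The figures are the ones quoted above; the one assumption added is that 'full sun' means 1 kW per square metre, which is what makes 10 square metres at 15% come out at 1.5 kW. The variable names are made up for the example.

```python
DAYS = 365

generated_kwh = 1951           # measured over the first year
promised_kwh_per_day = 7.5     # the installer's estimate

actual_per_day = generated_kwh / DAYS
promised_per_year = promised_kwh_per_day * DAYS
fraction_of_promised = generated_kwh / promised_per_year

# Estimate from the irradiation map
panel_area_m2 = 10
efficiency = 0.15
full_sun_kw_per_m2 = 1.0       # assumption: 1 kW per square metre in full sun
peak_sun_hours = 5.5

peak_kw = panel_area_m2 * efficiency * full_sun_kw_per_m2
map_estimate_per_day = peak_kw * peak_sun_hours

print(f"actual:       {actual_per_day:.2f} kWh per day")
print(f"promised:     {promised_per_year:.0f} kWh per year")
print(f"achieved:     {fraction_of_promised:.0%} of the promised output")
print(f"map estimate: {map_estimate_per_day:.2f} kWh per day")
```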

Our average daily consumption is about 6.23 kWh for the last year - 2274 kWh. The wet summer didn't help here either: we had to use an old clothes dryer on several days.

The system is therefore supplying about 85% of our needs. On the up side, we are now being paid 60c for each kWh generated and it costs us about 20c for each kWh consumed.

Electricity earns us $2.00 per day - excluding service charges.

Gas

To heat our water we use natural gas. We consume about 27 MJ per day on average. This is equivalent to 7.5 kWh per day, so our total energy consumption is about 14 kWh per day.

Gas costs us 50c per day - excluding service charges. This is about 7c per kWh.

Heating

We have a reasonably efficient wood fire for winter heating. I can only guess that we would use 10 to 20 kg of wood for about 100 days each year.  At 15MJ/kg we consume about 40-80 kWh each day for these 100 days or about 11 to 23 kWh per day on average.

Ignoring the capital cost and fuel for the occasional chain-saw use, the wood costs us nothing.

LPG

Our car runs on LPG. We consume about 8L per day. At 27.8 MJ/L this is a massive 62 kWh per day - just to run a car.

This is nearly double what we use in our house when the wood heating is included.

LPG costs us about $5.60 per day or 9c per kWh.

Total

In total, we generate about 5 kWh per day and consume 75 kWh per day (I have excluded the wood since we only plan to use wood that would otherwise have been chipped at the tip).

We have a long way to go before we are sustainable. The car is by far the biggest problem.

In terms of cost per kWh, electricity is double the cost of LPG and triple the cost of natural gas.
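The unit conversions behind these totals are simple enough to check with a few lines of Python. The sketch uses the daily figures quoted in this post and the fact that 1 kWh is 3.6 MJ; the variable names are made up for the example, and wood is left out as in the total above.

```python
MJ_PER_KWH = 3.6                        # 1 kWh = 3.6 MJ

electricity_kwh = 6.23                  # average daily consumption
gas_kwh = 27 / MJ_PER_KWH               # 27 MJ of natural gas per day
lpg_kwh = 8 * 27.8 / MJ_PER_KWH         # 8 L of LPG per day at 27.8 MJ/L

total_kwh = electricity_kwh + gas_kwh + lpg_kwh

gas_cents_per_kwh = 50 / gas_kwh        # gas costs 50c per day
lpg_cents_per_kwh = 560 / lpg_kwh       # LPG costs $5.60 per day

print(f"gas:   {gas_kwh:.1f} kWh per day at {gas_cents_per_kwh:.1f}c per kWh")
print(f"LPG:   {lpg_kwh:.1f} kWh per day at {lpg_cents_per_kwh:.1f}c per kWh")
print(f"total: {total_kwh:.1f} kWh per day")
```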

Thursday, 11 March 2010

Directory to XML BASH Script

Ever wanted to get a directory into an XML structure?

Here is a quick, short and easily modifiable BASH script that works well.

To see how it works, you could run it like this:

scriptname dir | xmllint --format - | less

For dir, you can use . .. ./ ../ or / as well as subdirectories and full directory paths.

You might also be interested in this project: xml-dir-listing

How it works

The script first makes special directory specifiers easily useable. It then calls doDir with the directory.

doDir uses ls to get all files in the current directory. If the file is actually a directory and not a sym-link, it calls itself to process the subdirectory. Otherwise it outputs the file name.

Certain special directories (. and ..) are ignored to avoid infinite loops.

The program can use a lot of stack space so I increase it - I just guessed a value. It also can take a long time, so I renice the process so you can do other things.

To stop it, you may need to enter a lot (10-20) of ctrl-c's. I'm not sure why.

Sample Output


<dir>
<dirname><![CDATA[/usr/share/doc/distcc/example]]></dirname>
<file><![CDATA[init]]></file>
<file><![CDATA[init-suse]]></file>
<file><![CDATA[logrotate]]></file>
<file><![CDATA[xinetd]]></file>
</dir>
<file><![CDATA[protocol-1.txt]]></file>
<file><![CDATA[protocol-2.txt]]></file>
<file><![CDATA[reporting-bugs.txt]]></file>
<file><![CDATA[status-1.txt]]></file>
<file><![CDATA[survey.txt]]></file>
</dir>
<dir>
<dirname><![CDATA[/usr/share/doc/groff]]></dirname>
</dir>
<dir>
<dirname><![CDATA[/usr/share/emacs]]></dirname>
<dir>
<dirname><![CDATA[/usr/share/emacs/22.1]]></dirname>
<dir>
<dirname><![CDATA[/usr/share/emacs/22.1/etc]]></dirname>
</dir>
<dir>
<dirname><![CDATA[/usr/share/emacs/site-lisp]]></dirname>
</dir>
<dir>
<dirname><![CDATA[/usr/share/enscript]]></dirname>
<file><![CDATA[88591.enc]]></file>
<file><![CDATA[885910.enc]]></file>
<file><![CDATA[88592.enc]]></file>
</dir>



The Script

#!/bin/bash

# WARNING: To stop this, you may need to press Ctrl-C many times

# heavy recursion so allow a bigger stack
ulimit -s 32768

# run with low priority so you can do other stuff while it works
# (discard renice's status line, if any, so it cannot end up in the XML output)
renice -n +19 -p $$ > /dev/null

function doDir {
  # directory name may contain illegal XML characters so we won't use attributes
  #echo "<dir name=\"${1}\">"
  echo "<dir>"
  echo "<dirname><![CDATA[${1}]]></dirname>"
  # get all files and directories, one per line
  # IFS= and -r keep leading spaces and backslashes in names intact
  ls -A1 "$1/" | while IFS= read -r file; do
    # recursively process directories but not sym-links
    if [ -d "${1}/${file}" ] && [ ! -h "${1}/${file}" ]; then
      # don't do . and .. either
      if [ "$file" != "." ] && [ "$file" != ".." ]; then
        doDir "${1}/${file}"
      fi
    else
      # output the file
      echo "<file><![CDATA[$file]]></file>"
    fi
  done
  echo "</dir>"
}

# normalise initial directories so they all work
DIR=$1
if [ "."   == "$DIR" ]; then DIR="$(pwd)" ; fi
if [ ".."  == "$DIR" ]; then DIR=".."     ; fi
if [ "../" == "$DIR" ]; then DIR=".."     ; fi
if [ "./"  == "$DIR" ]; then DIR="$(pwd)" ; fi
if [ "/"   == "$DIR" ]; then DIR=""       ; fi

# quoted so that a directory name containing spaces is passed as one argument
doDir "$DIR"
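
One edge case worth knowing about: a CDATA section ends at the first ]]> it meets, so a file or directory name that happens to contain ]]> would produce broken XML. The usual workaround is to split such a name across two CDATA sections. The helper below is only a sketch of that idea. The function name cdata is made up here and is not part of the script above.

```bash
#!/bin/bash

# Wrap a string in a CDATA section.
# Any "]]>" inside the string is split across two CDATA sections so that
# it cannot close the section early.
# usage: cdata STRING
cdata() {
  local s=$1
  local end=']]>'
  local split=']]]]><![CDATA[>'
  # "$end" is quoted so it is matched literally, not as a glob pattern
  printf '<![CDATA[%s]]>' "${s//"$end"/$split}"
}

echo "<file>$(cdata 'protocol-1.txt')</file>"
echo "<file>$(cdata 'odd]]>name.txt')</file>"
```

In doDir, the two echo lines that build the CDATA by hand could call this helper instead, for example echo "<file>$(cdata "$file")</file>". This does not help with control characters that XML 1.0 does not allow at all.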

Tuesday, 23 February 2010

UPDATE: HP C7280 Ink System Failure


UPDATE: We had another paper jam recently and it caused what seemed to be this same fault. Moving the gear back to its correct position solved the problem.

I had this problem today.

Ink System Failure... 0xc05d0381 (I did get another code after resetting the printer as well.)

It may have started two days ago when our first piece of paper jammed.

Note to self: when removing jammed paper, reassemble the pieces to see if you got it all out.

It worked fine today, but tonight it started making horrid gear-grinding noises. Then I got the 'Ink System Failure' message.

I tried a number of suggested remedies (see the 'fixyourownprinter' forum) but all they achieved was to reset my printer to factory defaults - so I had to re-enter that very long wireless router SSID code.

But it is such a pleasure with the HP Setup UI (did any HP tester actually use it? The input field doesn't even show all the characters so the last 3 are entered blind!)

After several power cycles, I decided to investigate. I noticed how the ink pumps on the print head seemed to work - a black arm attached to a white gear is allowed to rotate about half a turn. The black arm is connected to a metal shaft which rocks a spring-steel sheet that depresses some rubber boots, like you might have on a lawn mower.

The printer head can move all the way to the left. When it does, the gears engage with some other gears to drive this pump.

In the picture you can see two white gears in the middle. The left one can rotate around the larger one and can rest in two possible positions.

The position shown in the photo seems to be the correct position. The incorrect position is to the right of the larger gear. My little white gear was in the incorrect position so I moved it.

Now my printer complained that I had a jam. And indeed, I still had a small piece of paper stuck in the area where the head parks itself.

I removed it with chopsticks and all was well. The printer cycled ink through the cartridges and was happy again.

Friday, 19 February 2010

Buying new Tyres? What do Tyre Markings Mean?



Buying new tyres?

  1. If you live in Australia, spend $10 or go to a library and get the Choice test for tyres that your car uses (or something close).
  2. Look at the recommended tyres and decide what features you want more than others. 
In my case, I want best wet weather cornering and braking, followed by dry weather cornering and braking - then price.


I need 205/65R15, 215/60R15, or 225/60R15 tyres. I have 7" wide rims (15x7J) so I can use 8" to 9" tyres (205 - 225 and maybe 235).


You can change widths and profiles if your rims are the right width and you consider speedo errors.


For example, a 205/65R15 has nearly the same circumference as these:


  • 215/60R15 (1.3% error),
  • 225/60R15 (0.5% error), and 
  • 235/55R15 (1.2% error). 


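Here is my maths.


Circumference = pi * ( rim_size * 25.4 + 2 * ( profile / 100 ) * width )

where rim_size is in inches, profile is a percentage, and width and the resulting circumference are in mm.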


A worked example might help.

For 225/60R15, C = pi * (15*25.4 + 2*60/100*225) = 2045mm
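
To check the other sizes without a calculator, the same formula can be put into a small Bash sketch (awk does the floating-point work). The function names are invented for this example. The "error" is simply the percentage difference in circumference from the original tyre, with negative meaning a smaller tyre.

```bash
#!/bin/bash

# Circumference in mm from the tyre size, e.g. 225/60R15 -> 225 60 15.
# usage: tyre_circumference_mm WIDTH_MM PROFILE_PERCENT RIM_INCHES
tyre_circumference_mm() {
  awk -v w="$1" -v p="$2" -v r="$3" 'BEGIN {
    pi = atan2(0, -1)
    printf "%.0f\n", pi * (r * 25.4 + 2 * (p / 100) * w)
  }'
}

# Percentage difference in circumference between a new tyre and the
# original one. pi cancels out, so only the diameters are compared.
# usage: circumference_change_percent W P R  ORIG_W ORIG_P ORIG_R
circumference_change_percent() {
  awk -v w="$1" -v p="$2" -v r="$3" -v w0="$4" -v p0="$5" -v r0="$6" 'BEGIN {
    d  = r  * 25.4 + 2 * (p  / 100) * w
    d0 = r0 * 25.4 + 2 * (p0 / 100) * w0
    printf "%.1f\n", (d - d0) / d0 * 100
  }'
}

tyre_circumference_mm 225 60 15                    # 2045
circumference_change_percent 215 60 15 205 65 15   # -1.3
circumference_change_percent 225 60 15 205 65 15   # 0.5
circumference_change_percent 235 55 15 205 65 15   # -1.2
```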

What to look for.

  • Tread Wear 300+ (3 times the 'standard' tyre life) 
  • Traction (braking in the wet) AA or A 
  • Temperature A 
Other Markings

The tyre manufacture date is also on tyres. It seems to be stamped in, unlike the other text, which is raised. Look for something like 2309 - meaning the 23rd week of 2009.
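
Decoding the four digits is simple enough to sketch in a few lines of shell. The function name tyre_date is made up for this example, and it assumes a four-digit week-and-year code from 2000 or later, so it always puts "20" in front of the year.

```bash
#!/bin/bash

# Decode a 4-digit WWYY tyre date code, e.g. 2309.
# Assumption: the tyre was made in 2000 or later.
# usage: tyre_date WWYY
tyre_date() {
  local week year
  case $1 in
    [0-9][0-9][0-9][0-9]) ;;
    *) echo "expected a 4-digit WWYY code" >&2; return 1 ;;
  esac
  week=${1%??}   # first two digits
  year=${1#??}   # last two digits
  # ${week#0} drops a leading zero, so week 08 prints as 8
  echo "week ${week#0} of 20$year"
}

tyre_date 2309   # week 23 of 2009
tyre_date 4907   # week 49 of 2007
```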



Here are photos of the Treadwear, Traction, Temperature, tyre size and date from one of my old tyres.




Treadwear 300
Traction A
Temperature A


Tread width 225mm
Profile 60% (of 225mm)
Tyre Radial (R)
Rim 15"
96V load and speed rating


Manufactured in the 49th week of 2007

Car Tubeless Tyre Repair







I have just 'repaired' my first car tyre.

We ran over a No. 2. No, not that sort, a real No. 2... The sort that is fixed with 1" nails to fence posts... Only it wasn't fixed to the post anymore.

Both nails punctured the tyre.

A neighbour once showed me how to repair a Bob-Cat tyre so I thought I would have a go. It only had to last a week until I put a new set of tyres on the car anyway.

I bought a tubeless tyre repair kit ($15). One repaired hole is good and the other is a wait-and-see repair.