Philipp Lenssen realized that the browser on his Nokia 6600 was a powerful tool for accessing the web. To reduce his data traffic charges and get just the information he wanted, he used his own web site to feed RSS content to the phone. In this article, Philipp explains how he went about it.
Introduction
With my new Nokia 6600 and its built-in browser, I can view the World Wide Web, and not just the awkwardly small WAP 1.0 portion. WAP 2.0, the newer version of the Wireless Application Protocol, dropped WML (the Wireless Markup Language) in favor of XHTML. As you may know, the W3C (World Wide Web Consortium) has recently recommended that XHTML be used as the mark-up language for web pages.
The Background
The original idea of HTML inventor Tim Berners-Lee was to have a device-independent information channel. Web pages were intended to separate content (HTML, the Web's lingua franca), layout (CSS, Cascading Style Sheets) and functionality (JavaScript, a Netscape invention now standardized as ECMAScript). With this separation of content, layout and functionality, Web pages should cover not only the full-sized screen of the desktop PC, but also Braille readers, print output, Text-to-Speech (TTS)... and mobile phones.
The Document Format
However, just using XHTML 1.0 Strict with a stylesheet for the medium "handheld" is not enough to target all mobile devices. The Series 60 browser requires you to include the XHTML Basic or XHTML Mobile Profile Doctype. My pages are XHTML 1.0 Strict already, and I found it was not necessary to change their Doctype. By simply placing the XHTML Basic Doctype inside an HTML comment (see Example 1), I could trick the browser into interpreting my XHTML.
Example 1
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; CHARSET=UTF-8" />
<title>Cross-Media</title>
<link rel="stylesheet" href="default.css" type="text/css" media="handheld" />
<link rel="stylesheet" href="default.css" type="text/css" media="screen" />
<link rel="stylesheet" href="screen.css" type="text/css" media="screen" />
</head>
<body id="blog">
<!--
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.0//EN"
"http://www.w3.org/TR/xhtml-basic/xhtml-basic10.dtd">
-->
<h1>Cross-Media</h1>
<!-- ... -->
</body>
</html>
The code in Example 1 is served with the content type "text/html" so that it also works in Internet Explorer and other, older browsers which don't understand XML/XHTML. (HTML 4 and previous versions are based on SGML, while XHTML is based on XML; still, for XHTML 1.0 it's valid to serve the page as "text/html".)
The Content
So now I can use HTML and stylesheets on my Nokia. I can use text colors, background colors, background images, borders, floating blocks. It all works fine; but where's the content?
I want recent news. Pictures are too much, as I have to pay for traffic (that might be as much as 0.10 Euro per 30KB block with my German T-Mobile provider if I go over a monthly 5MB limit). But even without images, the endless navigation links and other "garbage text" are too much... slow, confusing, costly, and they may also crash the browser.
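To put a number on that, here is a rough sketch of the arithmetic. It assumes that every started 30KB block is billed in full and that the whole download falls outside the monthly allowance; neither detail is confirmed by the tariff. The constant and function names are made up for this illustration.

```php
<?php
// Figures quoted in the text; the names are hypothetical.
const BLOCK_BYTES = 30 * 1024;   // one billing block of 30KB
const EURO_PER_BLOCK = 0.10;     // price per started block (assumption: billed in full)

/**
 * Estimate the cost in Euro of downloading $bytes beyond the monthly limit.
 */
function traffic_cost(int $bytes): float
{
    $blocks = (int) ceil($bytes / BLOCK_BYTES);
    return round($blocks * EURO_PER_BLOCK, 2);
}

printf("%.2f Euro\n", traffic_cost(150 * 1024)); // a 150KB page with images: 0.50 Euro
printf("%.2f Euro\n", traffic_cost(4 * 1024));   // a 4KB stripped-down page: 0.10 Euro
```

Under these assumptions, a single image-heavy page costs as much as five text-only ones.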
RSS is the solution. "Really Simple Syndication" (or "RDF Site Summary"), based on the W3C XML standard, is a meta-file format which gets straight to the point. Basically, it's a single file at a constant location telling automated Web clients: these are my recent headlines. This is their description. And here's the permanent link to the article. RSS is commonly used in personal Weblogs (Blogs) and indicated by a little orange "XML" button, but bigger news sites are also starting to provide RSS feeds.
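To make the structure concrete, the following sketch parses a tiny, made-up RSS 2.0 feed with PHP's built-in SimpleXML extension. The feed content and URLs are placeholders, and SimpleXML is used here only for illustration; it is not necessarily the parser a real setup would rely on.

```php
<?php
// A made-up RSS 2.0 feed with a single item; titles and URLs are placeholders.
$feed = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>http://example.com/</link>
    <description>Recent headlines</description>
    <item>
      <title>First headline</title>
      <link>http://example.com/articles/1</link>
      <description>A short summary of the article.</description>
    </item>
  </channel>
</rss>
XML;

$rss = simplexml_load_string($feed);

// Collect headline, permanent link and description of every item.
$headlines = [];
foreach ($rss->channel->item as $item) {
    $headlines[] = [
        'title'       => (string) $item->title,
        'link'        => (string) $item->link,
        'description' => (string) $item->description,
    ];
}

print_r($headlines);
```

Each item carries exactly the three things mentioned above: a headline, a description, and a permanent link.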
There are different RSS libraries available for different languages. I use PHP along with NuSOAP, both running on my Apache server; the script below loads the MagpieRSS library to fetch and parse the feeds. (You might also use Python, ASP, ASP.NET, JSP, or others.) I then choose some RSS news feeds, like "Yahoo! Top Stories". After reading them, I display a link list as XHTML.
The PHP script looks something like this:
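<?php
// MagpieRSS provides fetch_rss(); adjust the path to your installation.
require_once '../magpierss/rss_fetch.inc';

$max_items = 10; // example value: how many headlines to show
$url = "http://rss.news.yahoo.com/rss/topstories";
$rss = fetch_rss($url);
$items = array_slice($rss->items, 0, $max_items);
echo "<h2>" . $rss->channel['title'] . "</h2>";
foreach ($items as $item)
{
    // Array keys must be quoted strings.
    $title = $item['title'];
    $url = $item['link'];
    $item_description = $item['description'];
    // ...
}
?>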
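The loop body is left open above. As one possible way to produce the link list, the sketch below renders items as an XHTML list. The function name render_link_list is hypothetical, and the links here point straight at the articles, which is an assumption; pointing them at a script of your own that fetches the page would work the same way.

```php
<?php
/**
 * Render feed items as an XHTML link list.
 * Each item is an array with 'title' and 'link' keys,
 * mirroring the fields read in the loop above.
 */
function render_link_list(array $items): string
{
    $html = "<ul>\n";
    foreach ($items as $item) {
        // Escape feed text so stray characters cannot break the XHTML page.
        $title = htmlspecialchars($item['title'], ENT_QUOTES, 'UTF-8');
        $link  = htmlspecialchars($item['link'], ENT_QUOTES, 'UTF-8');
        $html .= "<li><a href=\"$link\">$title</a></li>\n";
    }
    return $html . "</ul>\n";
}

// Placeholder data standing in for real feed items.
echo render_link_list([
    ['title' => 'Markets & more', 'link' => 'http://example.com/a?id=1&page=2'],
]);
```

Escaping matters on a phone browser in particular: an unescaped "&" in a headline or URL makes the page invalid XHTML.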
Now when a headline catches my interest and I follow the link, the tool will try to grab and deliver the content of the actual page by checking for text between two delimiters. These delimiters need to be adapted on a per-site basis (e.g. I might grab everything between the text of the headline and the string "Copyright by"). Next, I strip all tags, and include some simple structural tags (like "<br />" line breaks). Finally, I end up with a fast-loading minimalist version of the original web page, ready to be viewed on my phone (I could make the URL public to be accessed by other people as well, but that might go against a site's copyright restrictions).
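The delimiter approach can be sketched roughly as follows. The function names, the sample page and the delimiters are invented for this illustration; a real page would first have to be downloaded, and its delimiters chosen by hand.

```php
<?php
/**
 * Return the text between two delimiter strings,
 * or null if either delimiter is missing.
 */
function extract_between(string $html, string $start, string $end): ?string
{
    $from = strpos($html, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);
    $to = strpos($html, $end, $from);
    if ($to === false) {
        return null;
    }
    return substr($html, $from, $to - $from);
}

/** Strip all markup, then mark line breaks with <br /> tags. */
function to_minimal_xhtml(string $fragment): string
{
    return nl2br(trim(strip_tags($fragment)));
}

// Placeholder page standing in for a downloaded article.
$page = '<div>Home | Sports</div><h1>Big story</h1>' . "\n"
      . '<p>First paragraph.</p>' . "\n"
      . '<p>Second paragraph.</p>' . "\n"
      . '<p>Copyright by Example Corp</p>';

$fragment = extract_between($page, 'Big story', 'Copyright by');
echo to_minimal_xhtml($fragment), "\n";
```

Returning null when a delimiter is missing gives the calling script a chance to fall back to something else, for instance the feed's own description.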
Some RSS feeds also include the content of the blog post, so you don't need to convert the HTML using the delimiter hack. This approach is preferable as there is one problem with the "screen-scraping" technique: it might stop working if the grabbed page changes its HTML structure.
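A small sketch of that preference: use the full text when the feed provides it, and only fall back to scraping otherwise. The 'full_content' key is a hypothetical name for wherever your feed parser stores the complete post, and the scraper is passed in as a callback so that the example stays self-contained.

```php
<?php
/**
 * Pick the article body for one feed item.
 * 'full_content' is a hypothetical key; $scrape is a callback that
 * fetches and cuts down the linked page when the feed has no full text.
 */
function article_body(array $item, callable $scrape): string
{
    if (!empty($item['full_content'])) {
        return $item['full_content'];
    }
    return $scrape($item['link']);
}

// Stand-in scraper for the example; a real one would download the page.
$fake_scraper = function (string $url): string {
    return "scraped text from $url";
};

echo article_body(['link' => 'http://example.com/1', 'full_content' => 'Full post text.'], $fake_scraper), "\n";
echo article_body(['link' => 'http://example.com/2'], $fake_scraper), "\n";
```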
Conclusion
This seems like a whole lot of work just to browse the Web the way it was intended. Still, I'm quite excited to see that it works at all within this limited display space. The more the Symbian OS and handheld browsers gain an audience, the more important it will become for webmasters to change their pages to work cross-media. There might come a day when competition won't allow online publishers to think single-media.
About Philipp Lenssen
Philipp Lenssen lives in Germany and is Senior Developer for the websites of a popular sports car. He also writes a daily Google Blog which keeps track of current Google news, as well as research into what is done, can be done, and should not be done with Google.
Web: blog.outer-court.com
(c)2004 Philipp Lenssen and SymbianOne.com - All Rights Reserved