CNBC, take a lesson from the March Hare: Announce what you’ll be broadcasting, and broadcast what you announce

When I got satellite TV last year from Shaw, one of the channels to which I got access in the package to which I subscribe is CNBC. CNBC has some very interesting programming and documentaries. Some remind me of why I got hooked on cable channels over the traditional network channels in the early nineties, and are comparable to series such as the (excellent) Bill Kurtis’ American Justice on A&E and take your pick of any of the mainstays on The Discovery Channel like Frontiers of Construction, Mega Ships, Ultimate Engineering, and the like.

One of the features of Shaw Satellite is that there is a “Guide” screen that shows the current programming as well as future programming; so far — for the purposes of this entry, in fact — the furthest into the future I have ever checked is a touch over 24 hours.

What I am wondering is how much CNBC actually cares to accurately report their schedule in advance to the cable carriers who arguably are their bread and butter, put aside of course their advertisers. I am obviously aware that I am perhaps at best on the fringes of CNBC’s prime target audience, since I’m not a stock broker or a financial analyst.

Of course a small part of me is wondering about how much effort Shaw puts into requiring accurate reporting of scheduling from the stations that they broadcast, at least to the extent that they have power over such matters and wish to try to exert such power over the stations they broadcast. I am aware that Shaw is but one player in the Canadian market, and that CNBC is a player in the much larger American market.

But in researching this post I have come to the conclusion that Shaw probably has little to do with the issue at hand.

Over the last few months, I have occasionally wandered over to CNBC because of a scheduled show that sounds interesting, usually as a result of channel surfing. And what do I find but a substitution program, however interesting it may be. I can count at least four such times prior to this past weekend that the actual program is different from the announced program. This phenomenon caught my attention again this past Thursday, April 01 2010, after 10:00pm when I “wandered” over to CNBC, interested by the announcement of “Enron: The Smartest Guys in the Room”; what was in fact being broadcast was a one hour documentary called “Ultimate Fighting — Fistful of Dollars”. Note that from now on I am referring to times in Eastern Daylight Time.

My conclusion was a presumption that the occurrence was no doubt a semi-regular occurrence given the sporadic nature of my noticing this at random intervals over time; the perceived regular occurrences seemed a bit beyond the usual occasional errors due to technical difficulties such as:

– transcription errors due to a clerk for either party entering the wrong information into the schedule;
– the station expecting to secure the appropriate broadcast rights in time to a given show or episode but not managing to, hence the switch;
– the station losing a last-minute scuffle with the show’s copyright holder(s) and having to show something else as a result;
– the station realizing, after sending off the planned schedule, that there was an accounting error and they had used up their broadcast rights for the show or particular episode, or neglected to renew it;
– a major news event making a last-minute substitution appropriate, such as a market crash or the sudden arrest of the CEO of a major multinational;
– etc.

In this case I’m usually discussing a repeat of, say, their Wal-Mart documentary instead of their Marijuana production in the US documentary, both being otherwise very interesting documentaries, but neither being of particular greater value or time-sensitive interest over the other. However in a couple of cases live market shows have been actually shown, which I am aware are CNBC staples.

So for the past few days I have been, without going much out of my way, noticing the schedule versus the actual show being broadcast on CNBC. (I of course won’t bother mentioning the times that I checked and the schedule was accurate, or could have but didn’t check.)

Saturday April 03 2010 at 11:30am, the schedule announced the half-hour “Sexy Beach Bodies” (I’ll presume it was a show about Beach Culture, Suntan Lotion, and possibly Skin Cancer. Or an infomercial on how to get ripped in 28 days or less 🙂 ) The actual show was “Cruise Inc.”, a documentary about the cruise industry, which, being an hour-long documentary, began at 11:00am.

Saturday April 03 2010 at 10:00pm and 10:30pm, the schedule said “Till Debt Do Us Part”, a Canadian personal finance reality TV series about various couples’ train wreck spending habits heading straight for more debt after bad, and how to get out of it. The actual show was a one hour documentary called “Game Changers” about innovative entrepreneurs who were very influential in their field.

As I was writing this post on Sunday afternoon, I checked CNBC again, for my amusement and to add to the case either way:

Sunday April 04 2010 at 5:00pm was announced as “Paid Programming”, while the announced show for 5:30pm was “Relieve Back Pain”. The actual show from 5:00pm to 6:00pm was a one hour documentary called “The Money Chase: Inside Harvard Business School”.

As I was editing this post on Monday evening at about 9:45pm, I happened upon CNBC and decided to amuse myself again:

Monday April 05 beginning before 9:30pm and ending at 10:30pm, the schedule said “Enron: The Smartest Guys in the Room” followed by “Till Debt Do Us Part” at 10:30pm. “Squawk Box” was at 9:45pm. At 10:00pm, “Cash Flow” came on.

Now in the defense of either party, strictly speaking — and I mean, during the same period I only noticed the following example in the course of my TV consumption, and at the same time do not recall in the past several months noticing another such occurrence outside of CNBC — the phenomenon isn’t just limited to CNBC: On Saturday, April 03, 2010 at 9:00pm, I was watching KCTS PBS Seattle and was watching “As Time Goes By” (Yes, I am a boring person with nothing better to do on a Saturday evening!) The schedule was accurate about the show and time, but the episode description was off, so it’s not strictly speaking limited to CNBC. However for the moment I’m willing to classify this occurrence within one of the excuses listed above, and in any case I don’t recall observing the phenomenon for other stations, which generally have been accurate within my experience.

But here’s the clincher: At about 8:30pm on Saturday, April 03 2010, CNBC had a commercial announcing “Enron: The Smartest Guys in the Room” for Sunday, April 04 at 9:00pm to 11:00pm, so at that time I checked the Shaw schedule for Sunday evening, and sure enough the schedule listed the Enron documentary. At about 9:57pm on Sunday, April 04, finishing another show on another channel, I checked CNBC: It seemed as though “Squawk Box” was finishing, although I held judgement for five minutes, never having watched said show and figuring that maybe it could be a business update during an end-of-the-hour commercial break. Funny enough, at 10:00pm, they had a show called “The Run Down”, a live news show providing daily reporting on Asian markets, which are twelve to fourteen hours ahead of the New York Markets. Despite the currency of the live information, it seems to me that such a show would not trump the former in a last-minute showdown, certainly not from what they seemed to be showing in the first few minutes. However, to look at the opposite side of the same coin, this is the kind of show that, being another component of the station’s bread and butter, would be a matter of “Well duhhh, we always have those shows on, why is the programming guy announcing the Enron documentary?” making me wonder why the Enron documentary would be listed at that time. But no matter: whether CNBC intended to show one then quickly changed its mind, or didn’t bother to check whether the “listings announcement” guy is doing his job properly, the problem is the same: There seems to be a rampant problem with the concordance between the Announced Broadcast Schedule, and the Actual Broadcast Schedule.

Come On, CNBC, what’s up with your programming? You’ve proven that at least not all the inconsistencies belong to Shaw’s clerks taking too many coffee breaks! In fact, it seems to me that were it not so blatantly obviously due to a case of laissez-faire on your part, I dare say there could be a bit of “Bait and Switch” going on.

So my message for you CNBC, is a paraphrase of the March Hare’s message to Alice: “Announce what you’ll be broadcasting, and broadcast what you announce.”

The Enron documentary schedule as announced on the CNBC website on April 06, 2010 at about 8:00pm EST

Google Maps seems to need to learn that some streets go East AND West

I think that Google Maps is overlooking a basic function: In the real world, people sometimes go east, and sometimes go west.

Yesterday for the third time in a couple of years I relied upon Google Maps for directions and was sent to the wrong place. Caveat Emptor strikes again.

In Montreal, east-west streets which bisect St. Laurent Boulevard (which, no surprise, goes sort of north-south), start their numbering in both east and west directions from there. Hence you can have two equally valid addresses on a given street, given the proviso that one is designated as “East” and the other “West”. (Hey! It’s Captain Obvious!)

Fortunately, the address I was looking for was 151; during an hour of going around the neighbourhood looking for parking around “151 Laurier” (East as proposed by Google Maps), I found out that that address wasn’t a dépanneur that sells a huge variety of microbrewery beers, and looked like it never was, and finally decided to go further down the street looking for similar businesses. I suddenly had a V-8 moment and realized “Ooops what about 151 Laurier WEST?” I high-tailed it in the opposite direction and found the business in question. And to my disappointment, they were out of the particular beer I was seeking — Weizenbock, by La Brasserie Les Trois Mousquetaires, which has replaced my previous definition of ambrosia, Trois Pistoles by Unibroue.

Twice before I have had similar experiences:

About a year ago, while in Western Canada in completely unfamiliar territory on a business trip, I had looked up a client’s address, and not knowing about any local east/west splits that addresses on the Trans-Canada Highway may have in that locality, I tried to find the address, on the east end of town, that Google Maps had provided; I was about 45 minutes late by the time I finally managed to suspect that my client’s address was a “West” address and got there.

And just to quash any participant in the Peanut Gallery out there about to say “Aha well when using Google Maps you should know that in such cases they’ll always send you to the East address, so be sure to always check both!” a couple of years ago I had looked up a local address for a client, and Google sent me to Gouin Boulevard West here in Montreal, a solid 45 minute drive away from my client’s Gouin Boulevard East address.

Now the Peanut Gallery may have a point: In the real world, people sometimes go east, and sometimes go west. And when it comes to using a free online service, you get what you paid for. As such, when looking up an address on any online service, one should notice “Hmmm this is an east-west street which may bisect such and such a street and as such have East addresses and West addresses; I should specify both east and west in my address search.”

But I wonder how many other people place enough faith in Google that under such circumstances — such as when they don’t know that there’s an East and West of a given street — they would reasonably expect in the case that a street has valid East addresses and valid West addresses (and likewise for North and South addresses) that Google’s response page would come back with “Did you mean (A) 151 Laurier East, or did you mean (B) 151 Laurier West?” Certainly Google seems good enough at asking such a question when you slightly misspell a street or city name, or decides that it doesn’t recognize the address you supply and provide you with half a dozen options, as often spread across the country as spread across the city.
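To make the wish concrete, here is a minimal Python sketch of the "Did you mean ...?" behaviour I have in mind. It assumes nothing about how Google Maps actually works; the tiny address list and the function names (lookup, respond) are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Address:
    number: int
    street: str
    direction: str  # "East" or "West"


# Hypothetical toy index; a real service would query a proper database.
KNOWN_ADDRESSES = [
    Address(151, "Laurier", "East"),
    Address(151, "Laurier", "West"),
    Address(200, "Sherbrooke", "East"),
]


def lookup(number: int, street: str, direction: Optional[str] = None) -> List[Address]:
    """Return every known address matching the query.

    If no direction is given, both the East and West candidates are kept.
    """
    return [
        a for a in KNOWN_ADDRESSES
        if a.number == number
        and a.street.lower() == street.lower()
        and (direction is None or a.direction.lower() == direction.lower())
    ]


def respond(number: int, street: str, direction: Optional[str] = None) -> str:
    """Answer directly when the query is unambiguous, otherwise ask."""
    matches = lookup(number, street, direction)
    if not matches:
        return "No match found."
    if len(matches) == 1:
        a = matches[0]
        return f"{a.number} {a.street} {a.direction}"
    options = " or ".join(f"{a.number} {a.street} {a.direction}" for a in matches)
    return f"Did you mean {options}?"
```

Calling respond(151, "Laurier") asks the question instead of silently picking a side, while respond(151, "Laurier", "West") answers directly.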

Ubuntu and Fedora LiveCDs — Ubuntu a clear winner!

I’m trying to convince a certain group to wipe their virus-infected computer (no doubt also riddled with trojan horses, key loggers, and spyware) and move it over to linux, and so I’ve burned the Fedora 12 Live CD and the Ubuntu 9.10 Live CD.

I don’t want to bother giving them the Fedora Live CD. The Ubuntu CD is far too slick. And, the Fedora Live CD is far too vanilla. And that’s despite my usual rivalry with Ubuntu; at first glance, the killer is the inclusion of OpenOffice.org on the Ubuntu CD, while Fedora has the lightweight (albeit otherwise capable) AbiWord. Even the brown looks bright and welcoming, as opposed to Fedora’s more conservative, dull greyish-blue.

Add to that the directory of various files introducing Ubuntu, what it’s about, and even a sample mortgage calculator, and it’s little wonder that Ubuntu gets a whole lot of first timers straight out of the gate, or that first timers settle on Ubuntu after trying a bunch of other distros. As a marketing tool (at least for the desktop), the Ubuntu CD wins hands down; I’m not even sure that fully set up via traditional means from the DVD or full set of CD’s Fedora is this flashy.

I’ve been telling people for a while that “I use Fedora, but you’ll find Ubuntu easier”. I’ve just seen the proof. Seeing the CD, I would want to start afresh with it. I won’t of course, but I was impressed.

I’m wondering, though, which is the real killer — the inclusion of OpenOffice.org, or the directory introducing Ubuntu? I bet that were Fedora to mount a similar directory, including how to expand upon the base supplied on the CD, that people might take it up a bit more. I’m thinking of things like “Accustomed to OpenOffice.org? Go here and this is what you do.” or a “top five” “what to do once you install the Fedora base (or even just the Live-CD)” based on “Common desktop tasks”, “Setting up a home file and media server”, or the usual choices found in the standard anaconda setup.

I’m even thinking that the Ubuntu Live CD is productive — and “complete” — right away with its little directory, forget having little tutorials.

I guess that I should find out about whether or not Fedora does something like this, though … 🙂

Using apps to do a “Pothole and Poop Patrol”

Setting the stage: Last June I was blown away with an insurance company’s commercial for an IPhone/Smartphone App letting people properly document a car accident in order to help simplify the claims process.

Looking through the newspaper this morning, I noticed yet another example of an otherwise mundane app for smartphones: Apparently, a bunch of American (and presumably other) cities have apps which allow local citizens to collect data on potholes, including photos and location (usually but not always by gps coordinates), and to use 3G/Wi-Fi hotspots to report them directly to the local public works, saving money by bypassing presumably more expensive operators, field inspectors and the like, as well as saving money by directing workers directly to where work is needed instead of waiting around for the information to trickle through the system. And, essentially, putting crowd-sourcing, or the notion of “many eyes will eventually reveal bugs” to work.

Many such apps also are more general and allow people to report all sorts of things beyond potholes, such as broken lamp standards, water main breaks, and the like.
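As a rough illustration of what such an app might send to public works, here is a minimal Python sketch of a report record. The field names and the JSON payload are my own assumptions; the newspaper article did not describe any city's actual format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class IssueReport:
    """One citizen report; every field name here is hypothetical."""
    category: str                         # e.g. "pothole", "broken lamp standard"
    latitude: float                       # decimal degrees from the phone's gps
    longitude: float
    photo_filename: Optional[str] = None  # photo attached by the citizen, if any
    note: str = ""


def to_payload(report: IssueReport) -> str:
    """Validate the coordinates and serialize the report as JSON."""
    if not -90.0 <= report.latitude <= 90.0:
        raise ValueError("latitude out of range")
    if not -180.0 <= report.longitude <= 180.0:
        raise ValueError("longitude out of range")
    return json.dumps(asdict(report), sort_keys=True)
```

The same record covers potholes, lamp standards, and water main breaks just by changing the category.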

Beyond being impressed, it made me think back to 2001-2002 when I’d just gotten a gps and started playing geocaching: One of the funny stories that came about in geocaching circles (and no doubt general gps circles) when people were learning about the uses of gps with 3m-8m accuracy involved some groups of people essentially going out on “Poop Patrol”, marking the locations of where they found piles of poop left by — no, I won’t indulge in the joke that just came into my mind and perhaps yours — ok, here it is, poople, who don’t clean up after their dogs. We thought “What, don’t people have better things to do with their lives than go around looking for piles of poop and filling up their gps memory with their locations? What are they going to do with the information and all the waypoints? Chase down and tackle the offenders? What about the local council meetings that will no doubt have people being laughed at during question period when they bring their lists of waypoints?”

Funny, mulling over the “Pothole Patrol” I read about in the paper this morning, the “Poop Patrol” seemed less amusing in the ridiculous sense and more viable as a way of measuring hot spots for increased street cleaning, or identifying dog walking hot spots where perhaps municipalities might consider adding dog runs where they might not have without the “Poop Patrol” data, or adding or reinforcing secondary services such as installing bag distributors for dog walkers who forgot their bags to “poop & scoop” mounted on lampposts (and of course filled by conscientious dog-walkers who can bring their excess supplies of plastic bags) as can be found in many dog run parks, or adding extra garbage bins in those areas.

Again, along those lines, I’m thinking about geocaching.com, which facilitates “Benchmark Hunting” (in its most basic form, taking the coordinates of USGS benchmarks and going out to hunt them for the pleasure of it, and then logging the finds as well as the adventures along the way on the website.) I bet the USGS takes advantage of the informal “inspections” in some way.

Or how, in about 2001-2002, again when I was starting off with geocaching, I’d registered for an account with Natural Resources Canada to search out Gravimetric Markers, essentially the same as geographic benchmarks but whose purpose and location are related to standard measurements for gravity; I live near one, and there’s one near my cottage, so the idea of doing volunteer inspections along the lines of simple checklists seemed like a fun complementary activity to geocaching. Such checklists could contain, say, 5 items on the physical integrity, access, and so on related to the marker which any person off the street could perform on a regular, semi-regular, or sporadic basis and the results of which could be useful to the maintainers so that the responsible body could channel resources to “more important” activities as well as prioritize maintenance schedules, as above.

At the same time, I also “noticed” all the Bell Canada telephone switching boxes along the way to the cottage in the same light and thought about doing volunteer inspections, which I never pursued.

Unfortunately home networking issues at the time made accessing my account with Natural Resources Canada difficult and the charm of both ideas fizzled out.

But now, the ideas from “The Poop Patrol” to volunteer inspections of Gravimetric Markers and Bell Canada switching boxes, in the light of “The Pothole Patrol” and taking into account human idiosyncrasies and the human penchant for such trivial pastimes, seem less silly …

Handheld computer Apps

Here’s a blog entry I meant to post last June, 2009, but for whatever reason I never got around to it. I’m doing it now to set the scene for my next entry. 🙂

*****

I just saw a commercial for Nationwide Insurance. And I was blown away.

They have this several times mentioned “accident app” for the unnamed but clearly identifiable iPhone / iPod Touch. It brings you through the process of what you need to do if you have a car accident — take photos of the damage, the area, the other car, the street, where the cars are relative to everything, then the address, the license of the insured car, your details, the other car’s and driver’s details … of course all this just from the commercial.

And I’m thinking … that’s the kind of thing a handheld with an integrated camera is for, not just taking pictures of Fluffy, Rover or the kids every five minutes and sending them to friends, or playing some inane game. I was thinking that the Apple App Store, without having checked it out myself, was probably full of useless apps like tip calculators and calorie counters.

I used to have a Palm Pilot. I still have it, in fact, but I don’t use it. You looked all around the net for little apps, and there’d be plenty of useless ones, and you get tired when the one or two useful apps just don’t cut it anymore and in the process required so much effort to find what was (usually) a second-rate app.

Or maybe I am glimpsing at why Ubuntu is doing so well.

I think I’ve been so high on my horse about open source that I’ve missed something.

Or, maybe one of the open source drawbacks is that there isn’t an open-source Linux “App Store” out there, creating the buzz that “you need this, it’s a killer app” or “it’s a killer appliance” creating the desire for the product (and then of course the apps would follow) or whatever (March 2010: Hello Android!)

(addition in March, 2010) Now of course this is a really bad observation on a technical level; the Apple “App Store” is a kind of repository for the IPhone and IPod Touch, and there are plenty of linux repositories; and, despite the completely different paradigms between a desktop (and even laptop and netbook) and a handheld device, of course there are plenty of little programs that would be apps for your computer available. Despite its convenience, I wouldn’t want to use my Acer notebook to fill out all the details of my latest car crash, even though it has a web cam in it and wi-fi, and I suppose I could get a 3G dongle for it. My meaning was more along the lines of when I was responding to a survey about a tax program, which asked why I used the version I used, in my case, the web version: I said I used it because I use linux and they don’t have a linux version, and that in order to have a linux version, the best way would be to push it through the repositories or have their own repository, in order to maintain bug fixes and updates in tax laws, meaning that the afore-mentioned buzz is no longer there surrounding a computer (or more) in every house, and the fact is what makes handheld devices so buzz-worthy is the combination of small convenient size and processing power. See next.

(back to June, 2009) My brother, who has an iPod touch, says the big difference between handheld computers today and those of five and ten years ago (aside from memory and processing power and the like) is the presence of wifi abilities and hotspots; the inclusion of a camera was implicit to his comments. And nowadays, gps antennas, motion detectors, and the like. Things “that could be done” five and ten years ago just weren’t there because, well, the instant connectivity (and the integration of connectivity into the applications and related software) suddenly makes it seem like an obvious thing, not just loading the handheld into a dock and syncing it with your desktop.

It’s tax time, and the Government of Canada supports linux!

Doing a bit of research for tax-time, I went to Service Canada’s website to get some extra information needed. I finally figured out how to navigate through some pages, and whaddya know, they support two linux distributions: Fedora (they added, incorrectly, “Core”) 8 — which of course now is out of date — and Ubuntu 7.1, which I suppose was really 7.10. I suppose to some government person who doesn’t quite understand Ubuntu’s version numbering system, 7.1 and 7.10 are “about the same” — of course, were there any validity at all, it would represent the January 2007 release of Ubuntu, which never existed, as opposed to the October 2007 release. 🙂
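For anyone not familiar with the numbering, Ubuntu versions are written YY.MM (two-digit year, then month), which is why 7.1 and 7.10 are not the same thing. Here is a minimal Python sketch of that reading; the function name is my own, and it only decodes the label without checking whether such a release ever existed.

```python
MONTHS = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def ubuntu_release_month(version: str) -> str:
    """Decode an Ubuntu 'YY.MM' version label into 'Month Year'.

    The version must stay a string: as a decimal number, 7.10 and 7.1
    are equal, which is exactly the confusion described above.
    """
    year_part, month_part = version.split(".")
    month = int(month_part)
    if not 1 <= month <= 12:
        raise ValueError(f"not a valid month in version {version!r}")
    return f"{MONTHS[month - 1]} {2000 + int(year_part)}"


print(ubuntu_release_month("7.10"))  # the October 2007 release
print(ubuntu_release_month("7.1"))   # reads as January 2007, which never existed
```

Treating the version as a number (float("7.10") == float("7.1")) loses the distinction, which may be how the "7.1" ended up on the page.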

I was pleased to see them finally picking up the slack, even if this was put in place about 2 years ago. 🙂

And of course, here’s the screenshot, with the appropriate areas highlighted.

Service Canada Supports Linux!

Cool (or mundane) computer trick impresses co-worker

I managed to impress someone at the office this week with a cool (read mundane) computer trick.

I got a call from the secretary, who is a few seconds’ walk from my desk, asking for a scanned version of my hand-written signature. I replied that on my computer at home I have it, and I could easily get it within a few minutes; she replies that it would be faster for her to just walk over with a piece of paper for me to sign, which she would then scan and play around with.

And this is where I began to impress her: By the time she got to my desk with said sheet of paper, I had already VNC’d into my home server’s desktop and was in the process of doing the same from the server to my main computer’s desktop (gotta finish the process of giving it a static IP and setting it up so that I don’t have to go through my home server. 🙂 ) I finished logging into my desktop, and looked in the likely directory, and voilà ! I fired up my home email client, and within a couple of minutes, she’d received my scanned signature.

Beyond the fact that the Gnome desktop is set up standard to do VNC — and the fact that I installed TigerVNC instead of using the standard Gnome Remote Desktop Viewer — too bad that I can’t really claim that this is a cool Linux trick, since my computer at work is Windows, and you can set up Windows boxes to “pick up the phone” too ….

She was still impressed, though. And it took about as much time as the whole process of signing a piece of paper, scanning it, cropping it, etc.

Canola oil instead of petroleum oil car treatments and ethanol blends

I was impressed the other day when I finally got around to rustproofing my car at Antirouille Métropolitain, a chain of rustproofing businesses in Quebec. My car is 13-14 years old and has virtually no rust, although I have to repaint the running board on the driver side yet again; I let things go too long over the past few months so the rust is starting up, but it’s not bad at all. Yet.

They asked me “do you want the traditional oil based treatment or the “bio” treatment? It’s dripless and made of canola oil.”

Apparently the selling point with most people was that it’s dripless, vs. their traditional oil treatment, for which the optimum formula is necessarily drippy. For me the selling point was that it’s canola oil, and the dripless part was just a secondary bonus. This doesn’t affect their usual performance guarantees.

After I’d paid and while the technician was prepping my car and even starting the treatment, I asked the man behind the counter “Aren’t you going to tell your technician to use the canola oil treatment?” To my surprise, he replied that their default policy is to treat cars with the canola oil unless the customer expressly asks for the traditional oil treatment, in which case he would then inform the technician to use “the old treatment”.

The story works out that it took three years to develop the product so that its effects would be equivalent to the traditional oil treatment they developed, and they spent two more years doing road tests before widespread commercialization of the treatment. They started commercializing the treatment in early 2009. Apparently, the canola oil treatment is the overwhelming choice at this location, as well as business wide to varying degrees (no doubt due to some clever marketing and a highly refined counter-level sales pitch that had me sold hook, line and sinker), to the point that they sell perhaps one or two traditional oil treatments per week, if that; apparently the principal selling point, as mentioned earlier, is that it’s dripless. In urban centres such as Montreal and Quebec City, this is a big selling point because people don’t like having oil drip marks in their driveways and on their garage floors. In somewhat less urban centres such as Sherbrooke, the adoption rate of the canola oil treatment is down to 40% to 60%, apparently because the market, having a larger rural clientele, isn’t as likely to have asphalt driveways or concrete garage floors that would be stained by the dripping oil from their rustproofing purchase, and/or seems slower in changing old habits, such as the “old” mentality (and old sales pitch) that being drippy is a necessary side-effect of the formulation so that it can have its maximum effect.

So I was quite impressed that the market is slowly shifting away from some “old fashioned” treatments. Now let’s hope that the rest of the formulation doesn’t outweigh the benefits of replacing the petroleum components.

Note that for the past few months I’ve also been making a point of buying gas from Sonic since they seem to be the only mainstream chain of gas stations in Quebec, or at least in the Montreal area, that sells ethanol blends (6%-10%); they also sell biodiesel blends. Sometimes I go really out of my way or plan routes to pass near a Sonic, but usually not much since there happens to be a Sonic minutes away from home. The other Sonic I occasionally frequent is near Drummondville when I happen to be driving that way. There is another along the way west towards the end of the island. Apparently there are a few other gas stations (I presume independents) that also sell ethanol blends in my area, although I have yet to locate them.

This part about the gas has been quite the reverse culture shock from Ottawa, where it’s (or was about 12 years ago when I worked there) the unusual case that a gas station either doesn’t sell ethanol blends or at least isn’t within a couple of blocks of one that does; it’s taken me over 12 years to finally get back to making a point of using the ethanol blends.

Now only if the ethanol blends were more available, and the blends were higher; however, a quick check on Wikipedia suggests that most cars with standard gasoline engines can only tolerate up to about 10% ethanol without some kind of adjustment.

PDF’s, Scanning, and File Sizes

I’ve been playing around with PDF’s for the past few weeks and have noticed a very interesting thing: A PDF is *not* a PDF is *not* a PDF is *not* a PDF, ad nauseam, and it would seem, ad infinitum. At least, so it would seem. Part of me almost wonders if the only distinguishing feature of a PDF is the .pdf extension at the end of the file. In “researching” this post I have learned what I knew already; PDF boils down to being simply a container format.

Lately I have been scanning some annual reports from years past for an organization I belong to, and due to the ways xsane 0.997 that comes with Fedora 12 scans pages — which I will concede straight out of the gate I have only explored enough to get it to do what I want and to learn how it does things “its way” — the PDF file sizes are “fairly” large.

In order to find this out, I first found out about one of the quirks in xsane 0.997: Something about the settings with xsane doesn’t have it stop between pages for me to change pages; at least, I haven’t gotten around to finding where the settings in xsane are to have it pause between pages. This is important because my scanner doesn’t have an automatic page feeder. The first page of results of a google search indicate several comments about this problem, but not a solution. At first glance the second page of results is of no help.

So I end up scanning pages one at a time, and then use GhostScript to join them all up at the end to make a single PDF.
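For the record, the joining step can be sketched in Python as a thin wrapper around the gs command line. This is a minimal sketch rather than the exact command I typed: it assumes Ghostscript is installed as gs and uses its pdfwrite output device, and the function and file names are made up for illustration.

```python
import subprocess
from typing import List, Sequence


def build_gs_join_command(pages: Sequence[str], output: str) -> List[str]:
    """Build a Ghostscript command line that merges PDFs into one file.

    -dBATCH and -dNOPAUSE make gs run without prompting between pages,
    -q keeps it quiet, and the pdfwrite device writes a new PDF.
    The input files are appended in the order they should appear.
    """
    return [
        "gs",
        "-dBATCH",
        "-dNOPAUSE",
        "-q",
        "-sDEVICE=pdfwrite",
        f"-sOutputFile={output}",
        *[str(p) for p in pages],
    ]


def join_pdfs(pages: Sequence[str], output: str) -> None:
    """Run Ghostscript; raises CalledProcessError if gs reports failure."""
    subprocess.run(build_gs_join_command(pages, output), check=True)
```

Something like join_pdfs(["page01.pdf", "page02.pdf"], "annual-report.pdf") would then produce the single combined file, with the page order following the order of the list.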

Without having added up file sizes, it was obvious that the total size of all the scanned pages at 75 dpi and in black and white was noticeably larger than the single PDF with all the pages joined. This did not bother me since, again without having added things up, the difference didn’t seem *too* great, and I assumed that the savings were principally due to administrative redundancies being eliminated by having one “container” as opposed to having 25 to 30 “containers” for each individual page.

Then this week a curious thing occurred: I scanned a six page magazine article, and then separately, another two page magazine article, at 100 dpi and colour, and whaddya know, the combined PDF of each set is smaller than any of the original source files. Significantly so. In fact, the largest page from the first set of six pages is double the size of the final integrated PDF, and in the case of the second set of two pages, each of the original pages is triple the size of the combined PDF. I’m blown away.
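Rather than eyeballing the sizes, the comparison can actually be added up. The following is a small sketch; the function name and the dictionary keys are my own invention for illustration, and it assumes the combined file is not empty.

```python
import os


def size_report(page_paths, combined_path):
    """Compare the per-page file sizes with the size of the combined file."""
    page_sizes = [os.path.getsize(p) for p in page_paths]
    combined = os.path.getsize(combined_path)
    return {
        "pages_total": sum(page_sizes),
        "largest_page": max(page_sizes),
        "combined": combined,
        # Above 1.0 means one single page outweighs the whole merged file.
        "largest_page_to_combined": max(page_sizes) / combined,
    }
```

A ratio of about 2 for the six page article and about 3 for the two page one would match what I saw.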

Discussing this with someone who knows the insides of computers far better than I do, I learned something: It would appear that xsane probably creates PDF’s using the TIFF format (for image quality), whereas Ghostscript, when joining files, seems to do what it can to reduce file sizes, which in this case I imagine means converting the TIFF’s inside the PDF’s into JPEG’s. A bit of googling does appear to associate TIFF’s and PDF’s when it comes to xsane; indeed a check on the “multipage” settings shows three output file formats: PDF, PostScript and TIFF. And looking in Preferences/Setup/Filetype, the TIFF Zip Compression Rate is set at 6 out of 9.
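One crude way to test this guess is to look at which stream filters a PDF declares: `DCTDecode` marks JPEG data and `FlateDecode` marks zip-style lossless compression. The sketch below just scans the raw bytes, so treat the counts as hints only. `FlateDecode` is also used for things other than images, and anything tucked inside a compressed object stream will not be seen. The function name and labels are my own.

```python
import re
from collections import Counter

# Stream filter names defined by the PDF format, with informal labels.
FILTER_LABELS = {
    b"DCTDecode": "JPEG (lossy)",
    b"FlateDecode": "zip/deflate (lossless)",
    b"LZWDecode": "LZW (lossless)",
    b"CCITTFaxDecode": "fax-style bilevel",
    b"JPXDecode": "JPEG 2000",
}


def count_filters(pdf_bytes):
    """Count how often each known filter name appears in the raw PDF bytes."""
    counts = Counter()
    for name in re.findall(rb"/(\w+Decode)\b", pdf_bytes):
        if name in FILTER_LABELS:
            counts[FILTER_LABELS[name]] += 1
    return counts


# Hypothetical usage:
# with open("page-01.pdf", "rb") as f:
#     print(count_filters(f.read()))
```

If the single pages from xsane show only lossless filters and the joined file shows JPEG ones, that would support the explanation above.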

So I googled PDF sizing, and one result led me to an explanation of the difference between the “Save” and “Save As …” options when editing a PDF: “Save” will typically append metadata on top of metadata (including *not* replacing the expired metadata in the “same” fields!); “Save As”, well, that’s what you really want to do to avoid a bloated file, since everything that should be replaced will be.
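The appending behaviour leaves a trace that can be checked. When a PDF is saved incrementally, the new revision is tacked onto the end of the file and finishes with its own `%%EOF` marker, so counting those markers gives a rough idea of how many times a file has been “Saved” on top of itself. This is only a sketch with a made-up function name, and a count above one is a hint rather than proof, since I believe some writers (linearized or “fast web view” files, for example) emit more than one marker anyway.

```python
def count_eof_markers(pdf_bytes):
    """Count %%EOF markers; each incremental save normally appends one more."""
    return pdf_bytes.count(b"%%EOF")


# Hypothetical usage:
# with open("edited.pdf", "rb") as f:
#     print(count_eof_markers(f.read()))
```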

Another result begins describing (what is no doubt but a taste of) the various possible settings in a PDF file, and how, using a given PDF editing application, you can go through a PDF, remove some settings, correct others, etc., and reduce the size of PDF’s by essentially eliminating redundant or situationally irrelevant information (such as fields with null values) whose presence would have the effect of bloating the file unnecessarily.

I’ve known for a few years that PDF’s are a funny beast by nature when it comes to size: For me the best example by far used to be the use of “non-standard fonts” in the source file, oh say any open-source font that isn’t in the standard list of “don’t bother embedding the font since we all know that nine out of ten computers on the planet have it”. In and of itself this isn’t a problem; why not allow for file size savings when it is a reasonable presumption that many text PDF’s are based on a known set of fonts, and most people have said known set of fonts installed already on their system? However, when one uses a non-standard font, or uses one of the tenth computers, and constantly ends up with four- to six-page text PDF’s ten times the size of their source documents, frustration sets in. I had wondered whether it was possible to designate a font substitution along the lines of “use a Roman font such as Times New Roman” whenever such a font is used (in my case, Liberation Serif or occasionally Nimbus Roman No9 L), so I asked my “person in the know”. Apparently Fedora 12’s default Ghostscript install, whose settings I have not modified, seems to do just that.
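To see whether a given PDF is carrying its fonts around with it, one can look for the font names it declares (`/BaseFont`) and for embedded font programs (`/FontFile`, `/FontFile2`, `/FontFile3`). The sketch below scans the raw bytes only, so it will miss anything stored inside compressed object streams, and the function name and result keys are hypothetical.

```python
import re


def font_summary(pdf_bytes):
    """List declared font names and count embedded font program references."""
    names = set(re.findall(rb"/BaseFont\s*/([^\s/<>\[\]()]+)", pdf_bytes))
    embedded = re.findall(rb"/FontFile[23]?\b", pdf_bytes)
    return {
        "base_fonts": [n.decode("latin-1") for n in sorted(names)],
        "embedded_font_streams": len(embedded),
    }


# Hypothetical usage:
# with open("newsletter.pdf", "rb") as f:
#     print(font_summary(f.read()))
```

A text-only PDF that lists Liberation Serif and shows one or more embedded font streams is a likely candidate for the “ten times the size” effect.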

I guess what really gets me about this is how complicated the PDF standard must be, and how wildly variable the implementations are (at least, given that Adobe licenses PDF creation for free provided that the implementations respect the complete standard), or more to the point, how wildly variable the assumptions and settings are in all sorts of software when creating a PDF. I bet that were I to take the same source and change one thing, such as equipment or software, the results would be wildly different.

So, concurrently with the above scanning project, I happened to experiment with a portable scanner (a fun challenge in and of itself to make it work, but it did without “too much fuss”). And I found out something interesting, which I knew had nothing to do with PDF’s but (I presume) rather with scanners, drivers, and xsane. I tried scanning some pages of one of the said annual reports with the portable scanner on an identical Fedora 12 setup using xsane, and the PDF’s that were produced were far greater in size than those scanned with my desktop flatbed scanner. My flatbed scanner would scan the text and the page immediately surrounding the text, but correctly identified the “blank” part of the page as being blank, and did not scan in those areas, thereby significantly reducing the size of the scanned image. The portable scanner did no such thing and created images from the whole page, blank spaces and all (rendered, in this case, as a dull grey), thereby creating significantly larger PDF files than the scans of the same pages made on my flatbed scanner. However, as I mentioned, I assume that this is a function of the individual scanners and their drivers, and possibly of how xsane interacts with them, and in my mind is not a function per se of how xsane creates PDF files.

Another interesting lesson.

AT&T does it again! (AKA Will I Ever Learn?)

So I just turned on my TV and here’s a commercial … family dinner … It’s Mom’s tablecloth … Back in the day my grandmother made this for me, they don’t make them like they used to anymore … pass the spaghetti … OOOPS! — NO, WAIT! Don’t do anything!

And they all naturally go to the net to look for a solution (peroxide and something else, everyone in internet cafés and schools around the world yells at their computer screens). And what does the computer screen look like?

A vague resemblance to the Gnome desktop under Ubuntu, with the white toolbars on top and bottom with hints of brown here and there, but it’s just a touch too blurry to identify it as anything other than NOT Windows, and that it’s probably MovieOS.

I guess that every time they shoot a commercial, the geeky “I use Linux at home, I’d love to have the bragging rights to *that* computer in the TV commercial” IT guy in the back is on their day off, or they don’t want to give Gnome or KDE a financial nod. Yet they want to go to the trouble of avoiding an MS or Apple desktop. Interesting.

(sigh …)