Googlebot Requested Another JS File

Back in January, I blogged about how Googlebot requested CSS and JS files. Ever since the news broke (to much fanfare), this front of SEO has been awfully quiet.

Well, GBot is at it again. This time it requested just a JS file and followed a 302 redirect to get it. The JS is embedded into the page with this code:

<script type="text/javascript" src="filename.js"></script>

The request for filename.js 302 forwards to another location which returns a 404, which I log. Yes, it's a trap designed to see who requests JS files and who doesn't (as part of my ongoing bot research documented in the past few posts).
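A minimal sketch of how such a trap could be wired up, written as a plain routing function so it can be dropped into any server. The target path `/trap/missing.js`, the function name `handle`, and the log format are made up for illustration; they are not the actual setup used here.

```python
from http import HTTPStatus

TRAP_SOURCE = "/filename.js"      # the path referenced by the <script> tag
TRAP_TARGET = "/trap/missing.js"  # hypothetical redirect target that never exists


def handle(method, path, remote_host, user_agent, log):
    """Return (status, headers) for a request and record trap hits in `log`."""
    if path == TRAP_SOURCE:
        # Step 1: any client that fetches script files lands here and is redirected.
        log.append(("redirected", method, remote_host, user_agent))
        return HTTPStatus.FOUND, {"Location": TRAP_TARGET}
    if path == TRAP_TARGET:
        # Step 2: only clients that actually follow the 302 reach this point.
        log.append(("followed", method, remote_host, user_agent))
        return HTTPStatus.NOT_FOUND, {}
    # Everything else is served normally (simplified to a bare 200 here).
    return HTTPStatus.OK, {}
```

Two log entries from the same remote host, one "redirected" and one "followed", are the signature of a client that both requests JS files and follows redirects.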

The hit details are:

  • Remote: crawl-66-249-70-178.googlebot.com (66.249.70.178)
  • Using HTTP/1.1 GET
  • UA: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)

The second request (the one GETting the URL in the 302 forwarding) came 1 second after the first request with identical hit settings.

And that's about it from me. Anyone else seeing similar hits?

Yahoo! SERPs Embed API RSS Feed

Something I've never seen before. In Firefox, I noticed that the Yahoo! SERPs are now showing the feed icon. For example, the search for [seo blog] embeds an API-based RSS feed of the query. Interestingly, the app identification (a required field for all API calls, and is supposed to be developer specific) is "yahoosearchwebrss".

I tried this for a few searches, both simple searches, like [seo blog], and more advanced ones like [inurl:ekstreme.com]. It's showing up for the UK-locali(s)zed old SERPs interface and the newly redesigned one.

I'd like to say this is a very cool idea, one to be filed under "Why didn't I think of that?" Well done Yahoo! for another innovation in SERPs.

Programmers: What’s your font?

An odd question to ask, but seriously: which font do you use in your text editor? Anything exotic? Is it monospaced or variable width? Serifed or not?

I ask because... well, no reason really. Just curious about what unsung heroes we use. I use Verdana because it's easier to read, and I'm now experimenting with Segoe UI. Never got into the monospaced fonts, even for programming.

So try changing your font today. You might just like it!

How to Crawl the Web like Slurp, How to Block Such Scrapers, and Explaining Weird Slurp Requests

Yet another bot-related post. The subject this time is Yahoo!, specifically Yahoo!'s crawler Slurp and the Babelfish translation service. The topic is weak SE bot authentication, specifically, the ability to scrape content from sites that use a weak form of Slurp authentication. Also, while researching this, I came up with an explanation for weird hits coming from Yahoo's IP addresses.

All search engines introduced a mechanism to authenticate their bots. Yahoo! said back in June that all Slurp hits will come from *.crawl.yahoo.net, but my recent observations showed that's not the case. Still, it doesn't matter for scraping purposes as there is a way to mimic Slurp's behavior almost 100%, perhaps enough to get away with it.

To see it in action, use Firefox with the User Agent Switcher extension and set your UA to Slurp's, namely "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)". Next, go to Babelfish, type your website's URL, choose a language pair, and translate the page. The bottom frame is the one we're interested in, so right click on that and choose "This Frame->Show only this frame". Next, check your log files.

In eKstreme.com's case translated into French, the frame's URL is this:

http://74.6.146.244/babelfish/translate_url_content?.intl=us&lp=en_fr&trurl=http%3A%2F%2Fekstreme.com
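The frame URL is just a query string wrapping the page being translated. A small sketch of pulling it apart with Python's standard library; the function name `babelfish_target` is my own invention.

```python
from urllib.parse import urlsplit, parse_qs


def babelfish_target(frame_url):
    """Return (language_pair, translated_url) from a Babelfish frame URL."""
    query = parse_qs(urlsplit(frame_url).query)
    # parse_qs percent-decodes the values, so trurl comes back as a plain URL.
    return query["lp"][0], query["trurl"][0]


frame = ("http://74.6.146.244/babelfish/translate_url_content"
         "?.intl=us&lp=en_fr&trurl=http%3A%2F%2Fekstreme.com")
print(babelfish_target(frame))  # ('en_fr', 'http://ekstreme.com')
```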

And the interesting hit details:

Remote host: proxy3.search.scd.yahoo.net (66.94.237.142)

UA: Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp) (via babelfish.yahoo.com)

Notice how close this is to a real Slurp hit: it has the word 'Slurp' in the user agent and comes from yahoo.net. So any (weak) Slurp authentication checking simply for Slurp and yahoo.net will be fooled. However, there are two key differences to allow for proper authentication:

  • The hit is not from *.crawl.yahoo.net, but from *.scd.yahoo.net.
  • The user agent has "(via babelfish.yahoo.com)" appended.

So in short: if you really want to authenticate Slurp, really do check for *crawl.yahoo.net in the remote host, not simply yahoo.net. Incidentally, that's the advice Yahoo! gives, so follow it!
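To make the difference concrete, here is a rough sketch of the weak check next to a stricter one. Both function names are hypothetical, and `remote_host` is assumed to be the hostname obtained from a reverse DNS lookup of the hit's IP address (ideally one that has also been confirmed with a forward lookup).

```python
def is_weak_slurp_match(user_agent, remote_host):
    """The weak check: 'Slurp' in the UA and any yahoo.net host."""
    return "Slurp" in user_agent and remote_host.endswith("yahoo.net")


def is_strict_slurp_match(user_agent, remote_host):
    """Stricter check: crawl.yahoo.net host and no Babelfish marker in the UA."""
    return (
        "Yahoo! Slurp" in user_agent
        and "(via babelfish.yahoo.com)" not in user_agent
        and remote_host.endswith(".crawl.yahoo.net")
    )


babelfish_ua = ("Mozilla/5.0 (compatible; Yahoo! Slurp; "
                "http://help.yahoo.com/help/us/ysearch/slurp) (via babelfish.yahoo.com)")
babelfish_host = "proxy3.search.scd.yahoo.net"

print(is_weak_slurp_match(babelfish_ua, babelfish_host))    # True: fooled
print(is_strict_slurp_match(babelfish_ua, babelfish_host))  # False: rejected
```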

There is more to this story. When Babelfish translates a page, it requests the URL to be translated twice. The first one is an HTTP HEAD request, and all being well (like not getting a 404 error), the page is properly requested using HTTP GET, which is what I described above. If an error is encountered, the GET request is not sent.

The HEAD request is very interesting as it sets the UA to be identical to the browser requesting the translation, without adding "(via babelfish.yahoo.com)". So what the hit will appear as in log files is a UA coming from *.yahoo.net. Which UAs can you see? Anything out there: I've seen bots like Sogu, AdSense Mediabot, IE, and Firefox. If you spoof your browser to be Slurp as described above, you'll get a Yahoo! Slurp request coming from *.scd.yahoo.net not from *.crawl.yahoo.net, meaning that the Slurp authentication will fail.
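Putting the HEAD and GET observations together, a log-analysis script could label hits roughly like this. It is only a sketch based on what I have seen in my own logs: the labels and the function name `classify_yahoo_hit` are made up, and a HEAD request from a Yahoo! proxy is only a probable Babelfish probe, not a certain one.

```python
BABELFISH_MARK = "(via babelfish.yahoo.com)"


def classify_yahoo_hit(method, user_agent, remote_host):
    """Give a rough label to a logged hit based on method, UA, and remote host."""
    host = remote_host.lower()
    if not host.endswith(".yahoo.net"):
        return "not-yahoo"
    if host.endswith(".crawl.yahoo.net") and "Slurp" in user_agent:
        return "slurp"
    if BABELFISH_MARK in user_agent:
        # The GET request: Babelfish appends its marker to the UA.
        return "babelfish-get"
    if method == "HEAD":
        # The HEAD request carries the visitor's own UA with no marker added.
        return "possible-babelfish-head"
    return "unknown-yahoo-proxy"
```

For example, a HEAD request with a Firefox user agent from proxy3.search.scd.yahoo.net comes back as "possible-babelfish-head".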

This, I believe, explains a question someone posted a year ago at Webmaster World. Following on from a different thread, Yahoo_Mike explained what was going on but didn't explain the details of the double-requests.

So in summary, three things:

  • Double check how you authenticate Slurp and make sure you're doing it properly to avoid scrapers.
  • We now have an explanation of the weird behavior seen from *.yahoo.net proxies with details about exactly what's going on.
  • Why doesn't Babelfish identify itself with the HEAD requests? Come on Yahoo!, you can do better!

Google Web Accelerator: Please identify yourself

A while back, I noticed some fishy bot-like behavior coming from a Google-owned IP address. After asking around, a friend suggested it could be Google Accelerator. So I emailed Google support and, to cut a long story short, it indeed was Google Accelerator (GWA for short).

The IP address back then was 64.233.172.34, which Google confirmed to be a public-facing GWA IP address. The hits were very bot-like: no referer, requesting pages blocked by robots.txt, and identifying themselves using the default user agents for IE or Firefox. However, the hits also showed atypical bot signs: Looking through the log files, I noticed that after the page is requested, all associated files are also requested: the Javascript files, the image files, and the CSS files. Interesting in its own right because remember, the hits are coming from a Google IP address but are really requests from real users - the GWA was acting as a proxy. I hope the implications of this are clear.

Regardless, I dropped it - my question was answered. But now it's back...

Over the past 10 days or so, a new IP address started to show the same pattern. This time, the IP address is 66.249.85.133 and it certainly belongs to Google. It resolves to ff-in-f133.google.com, requests using HTTP/1.1, and asks for gzip'ed pages. The requested pages are still ones blocked by robots.txt, the hits identify themselves as IE 6.0 (default user agent), and they come in without any referer. However, this time associated JS files are not requested, putting the new behavior firmly in botland.
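For anyone who wants to look for the same pattern in their own logs, here is a rough sketch of the checks involved. The robots.txt rule, the function name, and the IE 6.0 user agent string in the usage note are assumptions for illustration; substitute your own.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; use your site's real file instead.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dev
"""


def suspicious_proxy_hit(remote_host, user_agent, referer, path, robots_txt=ROBOTS_TXT):
    """True for a referer-less, browser-UA hit from google.com on a blocked path."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    from_google = remote_host.endswith(".google.com")
    browser_ua = user_agent.startswith("Mozilla/") and "bot" not in user_agent.lower()
    blocked = not rules.can_fetch(user_agent, path)
    return from_google and browser_ua and not referer and blocked
```

Calling it with host "ff-in-f133.google.com", a user agent such as "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", an empty referer, and a path under /dev flags the hit. A match only says the hit fits the pattern; it does not prove the request came from GWA.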

So far, I've noticed only a few hits, none of which identified themselves as Firefox. Given the history, my best bet at the moment is that it is GWA again on a new IP address, but the lack of JS requests makes me wonder if they also updated the code - maybe for analytics purposes? Regardless, GWA is still acting as a proxy, and so I expect it to identify itself as such. It can easily modify the user agent to hint that it's there. At the very least, it will be useful for analytics; examples of why identification is useful:

  • How many GWA requests does your site get?
  • Are GWA requests labelled as bots and discounted?
  • Should GWA requests be labelled as bots? This is more philosophical than technical.
  • Can GWA be used to scrape websites?

And of course, many more questions. So if anyone works for Google maybe you can spare a minute for this? :D

Firefox Extension Spying on Us? - Updated

Update: no more database logging. Details at bottom of this post...

The world of SEO went all smiling a few days ago with 97th Floor publishing their Social Media for Firefox extension. I think it's a great idea; Chris thinks so too, and SEOMoz are terribly excited by it. But it's spying on us. Let me explain.

Open up a new web page, say http://ekstreme.com/, make sure the SM for Firefox extension is in manual mode, and open the Live HTTP Headers extension. Now click the Manual button in the SM for Firefox extension and watch the headers scroll by.

You should see a few blocks of text: one for Digg, one for delicious, one for Stumble Upon, and one for reddit. The last request though is a request to the 97th Floor website. In eKstreme.com's case, the URL is:

http://www.97thfloor.com/social-media-for-firefox/put.php?url=http%3A%2F%2Fekstreme.com%2F&service3=3&service1=2&service4=0&service5=0

and the full headers are:

----------------------------------------------------------
http://www.97thfloor.com/social-media-for-firefox/put.php?url=http%3A%2F%2Fekstreme.com%2F&service3=3&service1=2&service4=0&service5=0

GET /social-media-for-firefox/put.php?url=http%3A%2F%2Fekstreme.com%2F&service3=3&service1=2&service4=0&service5=0 HTTP/1.1
Host: www.97thfloor.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-gb,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: MintUnique=1; MintUniqueMonth=1188626400; MintUniqueWeek=1189317600

HTTP/1.x 200 OK
Date: Fri, 14 Sep 2007 23:10:41 GMT
Server: Apache/1.3.37 (Unix) mod_fastcgi/2.4.2 mod_auth_passthrough/1.8 mod_log_bytes/1.2 mod_bwlimited/1.4 FrontPage/5.0.2.2635.SR1.2 mod_ssl/2.8.28 OpenSSL/0.9.7a PHP-CGI/0.1b
X-Powered-By: PHP/5.1.6
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Transfer-Encoding: chunked
Content-Type: text/html
----------------------------------------------------------

Notice anything fishy? A filename called put.php (put where? A database?) on the 97th Floor website telling it the URL I just requested info for along with some service data. Surely you're not spying on our social media activities 97th Floor... are you?
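To see exactly what is being reported, the query string of that request can be decoded with a few lines of Python. This is just a sketch using the standard library; the function name `reported_fields` is mine, and I don't know what the service1, service3, service4, and service5 numbers mean beyond being per-service data.

```python
from urllib.parse import urlsplit, parse_qsl


def reported_fields(request_url):
    """Decode the query string of a request URL into a dict of field -> value."""
    return dict(parse_qsl(urlsplit(request_url).query))


call = ("http://www.97thfloor.com/social-media-for-firefox/put.php"
        "?url=http%3A%2F%2Fekstreme.com%2F&service3=3&service1=2&service4=0&service5=0")

for name, value in reported_fields(call).items():
    print(f"{name} = {value}")
```

The first field printed is the full URL of the page I was looking at, which is the part that worries me.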

You'll notice from the headers that the put.php file returns text/html. What is the HTML? Browsing to the URL returns a blank page with one word: "Done". Done what, my dear? Logged the data into the database have we?

And are you tracking the hits with Mint too? Very slick.

So with all due respect, the extension is now uninstalled until we get a clear explanation from 97th Floor. Come on, the, errr, floor is all yours.


Update

After blogging the details above, I emailed a few people as a sanity check and to raise the alert. One of the people I emailed got in touch with Chris Bennett of 97th Floor, and so Chris emailed me and commented below. The summary of our discussions:

  • Yes there was data logging, but it was error logging. The data being sent via the URL is consistent with this, and I see no other evidence to shed more light on the question.
  • Chris emailed me a link to the database dump/report. It contained URLs and numbers associated with each of reddit, Digg, delicious, and SU for each URL. The download was huge - I stopped it at ~7MB.
  • Most of the URLs I saw in the database are harmless: news sites, blogs, etc.
  • Some of the URLs were bad to have in there: I didn't know this, but Google apps apparently has some URLs with usernames attached. There are other web apps like that. It's generally a bad idea to tie a username to a login URL (i.e., giving a cracker half the info they need...), but the system still won't log you in automatically.
  • Some URLs are really dangerous to have. Some login systems have a step in the login process that creates a unique URL associated with that session. Anyone who knows this (very hard to guess) URL, is logged in automatically, without a password being asked again. Yes, there was at least one URL on such a system in the database.
  • The bad URLs were logged when people left the extension in automatic mode.
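For extension authors who really do need to log something, one way to avoid capturing usernames and session URLs is to reduce every URL to its scheme and host before it leaves the browser. The sketch below is my own illustration of that idea, not what 97th Floor did in their fix; the function name `loggable_form` is hypothetical.

```python
from urllib.parse import urlsplit


def loggable_form(url):
    """Reduce a URL to scheme and host so no path, query, or user info is kept."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        # Skip anything that is not a normal web page (about:, file:, and so on).
        return None
    return f"{parts.scheme}://{parts.hostname}/"


print(loggable_form("https://user@example.com/session/abc123?token=xyz"))
# https://example.com/
```

This throws away the page-level detail, which is the price of not holding anyone's session URL in a database.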

So what's the conclusion? Given what I know (all summed up above), how Chris reacted, and what other people I know and trust said about Chris, my opinion is that this is an innocent mistake that had serious consequences. There is no evidence of malice that I know of, and regardless, it's now fixed.

Less than 24 hours after I blogged the post, Chris released an updated extension and an apology for the whole thing. I installed the new extension and saw no 'phone-home' activity in four different test URLs.

Chris should be commended for his quick and decisive response. I for one am happy to move on. But for everyone out there, the usual 'keep your eyes open' warning always applies. Next time it won't be someone who fixes the problem.

Google Acquires Strategic Stake in NASA

The NY Times has just announced that Google has acquired a strategic stake in NASA. Described as a "partnership of equals", the move sees Google paying $1.3 million a year to NASA. In exchange, Google founders Page and Brin get to park their private jet on a federally managed airfield 4 miles down the road, as clearly shown by the map below:


[Embedded map]

In return, NASA gets three new airplanes to help its non-existent fleet of air-borne experimental platforms. Additionally, it was confirmed that all future NASA missions will be powered by Google Sky, Google Moon, and others. NASA's JPL will be transformed into a real-time PageRank calculation datacenter.

My Super-Duper SEO Team

So Donna tagged me with the task of listing the people I would choose for my crack team of SEOs. The point is to list up to 7 people. Team building is a great exercise to make you really think about mutually-exclusive-collectively-exhaustive skills and personalities you think can work well together. So in no particular order...

  • Aaron Wall and Michael Gray for their unrelenting accurate reporting of Google's hypocrisy and arrogance. Anyone who reads this blog knows that I'm 100% in agreement with them. Of course, no one could wish for better SEOs on the team than both of these gents.
  • John Mueller: a very smart guy who recently sold his soul to the devil, er, joined Google. He's an expert experimenter with a true passion to figure out the SEs. In life you don't need to know everything, but you do need to know how to figure things out.
  • Matt Webb. Generally expert SEO and smart guy. He likes Linux a lot, so he would be useful for his server admin skills.
  • Kim Krause Berg for the ultimate resource in website usability.
  • Chris Winfield as he knows social media inside out.
  • Finally, we need some advanced/darker-than-white SEO tactics. Eli from Blue Hat SEO wins on this front. Why Eli? For generously sharing and explaining in simple terms some clever tactics. Very useful for newcomers.

And this completes the seven. If I had more space, there are a few others I'd like to have join up. So now I get to tag some folks, so come on Kim, John, and Chris.

Good Bots Gone Bad

I've been keeping a very close watch on bots hitting eKstreme.com lately and I've come up with some interesting observations. Some of them are of great importance to webmasters (like MSNBot's entries) and others are more just FYI. In no particular order:

  • Feedburner's bot does not obey robots.txt, specifically, this command:
    User-agent: *
    Disallow: /socializer/?
    It sends HTTP/1.1 HEAD (not the usual GET) requests to the Socializer's bookmarking pages. The remote host is chi-fetch.feedburner.com (66.150.96.121) and the user agent is FeedBurner/1.0 (http://www.FeedBurner.com).
  • MyBlogLog still has an empty user agent string. I posted about this on Cre8 detailing how I emailed MBL support and they promised (within minutes) to forward it to an engineer. The remote host is www1.mbl.sp1.yahoo.com (69.147.90.63). If you browse to that, you get the MyBlogLog home page. Why should you care? Because if you block empty UAs as an anti-scraper method, MBL will get blocked too.
  • MSNBot's authentication doesn't work. Yes, MSN's bot authentication is BROKEN for these IP addresses: 65.54.165.43, 65.55.235.216, 65.54.165.65, and 65.55.233.40. A lot of crawling activity occurs from these addresses, but they resolve to *.phx.gbl (what is that, anyway, Microsoft?!) not to the expected *.search.live.com. Because of this, any crawls from these IP addresses do not authenticate and so are blocked here on eKstreme.com and blogSci.com.
  • Where is Yahoo! Slurp's bot authentication? It was promised way back at the end of March, but I'm still seeing about half of the Slurp requests from *.inktomisearch.com in addition to the promised *.crawl.yahoo.net that allows for authentication.
  • Tailrank's bot does not obey robots.txt. I emailed them a while back and they promised the next (then-imminent) update will fix that, but nope, not yet. Tailrank still rummages through eKstreme.com without regard to how it should behave.
  • Techmeme's bot does not identify itself. The remote host comes up as techmeme.com (75.126.195.146) and the user agent is Mozilla/5.0 (compatible; Wazzup1.0.7613; http://70.86.131.10/Wazzup).
  • There is a lot of crawling activity from *.amazonaws.com. A quick background: Amazon is developing a whole great set of platform services called Amazon Web Services (AWS). I recommend everyone read up on them, especially EC2 and S3. The AWS crawling activity is mostly web startups looking to index the web or feeds - so fine. What I'm seeing is more and more evidence for scrapers running off AWS, which is not good. How to deal with this is a tough one: block everything from *.amazonaws.com or should Amazon personally identify the account holder using the remote host (like accountname.amazonaws.com)? The latter will strongly discourage scrapers and help in authenticating bots. If the scrapers get more frequent, everything on AWS will be blocked, including the good ones. Amazon needs to act now before this becomes a problem that affects all their customers.
  • More on MSNBot, sometimes it doesn't obey the robots.txt file. Hits from 65.55.208.139 are ignoring this command:
    User-agent: *
    Disallow: /dev
    That's only started in the last few days. Weird.
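Since several of these entries come down to bot authentication, here is a rough sketch of the usual reverse-then-forward DNS check. The function name and the injectable `reverse` and `forward` parameters are my own illustration (they make the check testable without a network connection); this is not necessarily the exact code running on eKstreme.com.

```python
import socket


def verified_bot_host(ip, allowed_suffixes,
                      reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Return the verified hostname for `ip`, or None if verification fails."""
    try:
        # Reverse lookup: which hostname does the IP claim to be?
        host = reverse(ip)[0].lower()
        if not host.endswith(tuple(allowed_suffixes)):
            return None
        # Forward-confirm: the claimed hostname must resolve back to the same IP.
        addresses = forward(host)[2]
    except OSError:
        # Covers failed lookups (socket.herror and socket.gaierror).
        return None
    return host if ip in addresses else None
```

With `allowed_suffixes` set to [".search.live.com"], an IP address that resolves to *.phx.gbl returns None, which is exactly why those MSNBot hits fail authentication. For Slurp, the suffix would be ".crawl.yahoo.net".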

I'm sure there is more I've missed, but this will do for now. I'll leave you with a parting thought perhaps hinting at the future of search at MSN/Live: the HTTP_ACCEPT header MSNBot sends with all requests is this:

text/html, text/plain, text/xml, application/*, Model/vnd.dwf, drawing/x-dwf

This is very interesting because:

  • application/* is all applications and binary files, including ZIP files. We talked about how Google is indexing (badly) binary data and how that's showing up in the SERPs. What are MSN/Live's plans there I wonder? It could be simply to index PDF files so the question is, can MSNBot 'see inside' ZIP files?
  • Model/vnd.dwf and drawing/x-dwf: Very interesting. Model/vnd.dwf is Autodesk Design Web Format and so is drawing/x-dwf, which, as far as I can understand, are text-based representations of designs for web delivery. Will we start seeing AutoCAD designs in Live Image Search soon? As I like to say, this is "fertile ground for speculation" ;)
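To check which content types that header covers, here is a small sketch that splits it up and tests a few types against it. The function names are made up, and matching is done case-insensitively since media type names are not case sensitive. Keep in mind that a type being accepted only says what the bot asks for, not what actually gets indexed.

```python
MSNBOT_ACCEPT = ("text/html, text/plain, text/xml, application/*, "
                 "Model/vnd.dwf, drawing/x-dwf")


def accepted_types(accept_header):
    """Split an Accept header into lower-cased media types, dropping parameters."""
    types = []
    for item in accept_header.split(","):
        media_type = item.split(";", 1)[0].strip().lower()
        if media_type:
            types.append(media_type)
    return types


def would_accept(accept_header, content_type):
    """True if content_type matches an exact entry or a wildcard in the header."""
    content_type = content_type.lower()
    main = content_type.split("/", 1)[0]
    offered = accepted_types(accept_header)
    return content_type in offered or f"{main}/*" in offered or "*/*" in offered


print(would_accept(MSNBOT_ACCEPT, "application/zip"))  # True
print(would_accept(MSNBOT_ACCEPT, "image/png"))        # False
```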

Socializer Update

A quick update to the Socializer:

  • Mister Wong has been added to the services, in the Top Services section. Why? They're very large in Europe with multi-language support and a dedicated user base.
  • Netvouz was moved into the Top Services section. I did a quick check of the rankings (like this) and it certainly deserves it. Heck, it out-ranked Slashdot! I've kept Slashdot in the Top Services section because of their new Firehose service. I'm going to be keeping an eye on it.
  • The web-marketer's Digg, Sphinn, has been added to the list. It's certainly a niche service, so it's not grouped as a Top Service. However, I'm going to be keeping a very close eye on it and PlugIM. Depending on the rankings, a switch might be in order. We'll see.

We now return to our regularly scheduled silence :) Yes, I'm very sorry for the lack of posts but I'm working on a new service that will launch shortly. It's been taking up quite a bit of time. More soon - like this weekend soon!

BBC iPlayer: More bugs than a door mat

I just found my invite to the new BBC iPlayer beta service. I got very excited to (finally!) be part of a major piece of online video history and man oh man is it bad. It's so bad, it makes Windows 3.11 look awesome.

Just how bad? Let's start at the top.

  • I copied over the link to a new Firefox tab and it asked for my log in details. It uses standard HTTP authentication, so I got a modal window that I cannot make go away interfering with me. Why do I want it to go away? Because I want to copy my username and password. Just so you understand why this is necessary, my username is FBArbkNad (actually, I changed one letter only for security, but that's pretty much it). What kind of username is that? They can at least use the email address as a username. I notice that in the top right hand corner, it has a 'sign in' link. Surely it knows I've signed in, no?
  • Fine, I manage to login after a copy/paste shuffle with notepad. Great interface. Very shiny, very sleek, a piece of art. Well done on that front. Browsing through, I picked a program at random and I was offered a download link. I click that, and errr, ummm, an error message. Remember Firefox? Yeah, it's not supported. The service requires Windows XP and IE and Windows Media Player. I scored 2 out of 3.
  • Fine. Sheesh, I'll load up IE. Again, copy/paste the URL and neatly, I was browsing at the correct page. At least the URLs are emailable as they point to the correct episode pages. So I click download and it asks me to login again. It's a different server (download.bbc.co.uk instead of www.bbc.co.uk) and it's a setup file. Fair enough, a player is needed. Save it and install it.
  • True to any bad installer, it tried to register something for startup in the registry. No, thank you (but thank you Spybot S&D), just run. When the player opened, it tried to register the same thing again. Does this remind you of RealPlayer and why it's so hated? To make it worse, the name of the file it's trying to register is "koh", making it completely obscure that it's related to the BBC iPlayer. Rename the file, willya? As my friends' resident geek, I get to deal with a lot of spyware and adware. You could at least help me by naming the files properly.
  • Hey, the player works. Very informatively, it tells me that my library is empty (remember, I asked to download a specific episode). Handily, it offers a link to the iPlayer home. Click that and a new IE7 starts. Now I'm angry: it refused to let me download the player from Firefox only for the player to turn around and start a new IE window? Why can't the player be downloadable from any Windows browser (it is Windows only as far as I can tell...) and then open as many IE windows as possible. This is very bad usability, not to mention annoying.
  • So now I'm on my third tab of the iPlayer website (the Firefox one, my IE one, and the iPlayer-started one). Fine. Just let me see a stupid episode. I browse around and click one at random... Oh, I need to log in again. Oh flippin' 'eck. I copy and paste the username and password again... and it says my password is too short. Umm... it's your freaking password!

So I give up and decide to write this rant, er, review. Honestly. I've seen alpha software that's better quality than this. Sorry Beeb. I love you, I love your site, but the iPlayer is broken.

eKstreme.com Reloaded

As you may have noticed, eKstreme.com now sports a brand new design. Not only that, it's also been ported to a new backend, namely WordPress. On top of that, it's now on a new host that promises to be more reliable and a whole lot faster.

Yep, a hat-trick of news that's been 3 months in the making. If you ever wondered why I haven't been replying to emails much, or why I haven't been blogging much lately, this is why. A lot of effort went into this, and I'm sooooo relieved it's over... almost.

Things will be broken! I know that. Firstly, the tools are complex beasts to run completely in WP. Yet they do, for the most part. Two tools are broken at the moment: the spell checker and the Backlink Social Celebrity tool. The first one is waiting on some software to be installed on the server, so it might take a few days. The latter is just sub-par coding on my part and I decided that instead of trying to fix it, I'd re-write the thing. Rest assured I'm on the case and I'll have them up ASAP.

Having said this, please file bug reports and feedback using the contact form.

Speaking of which, there are two people I'd like to thank wholeheartedly:

  • Joe Dolson who patiently gave constant feedback on the design as it was evolving (slowly) and helped troubleshoot some DNS issues.
  • Mike Cherim because eKstreme.com now uses the WP port of his contact form. It was very easy to add it in and it works a treat.

So, thanks to all the readers and users of eKstreme.com. This redesign was done with you in mind. As always, feedback welcome, and there is a LOT more to come now!

Free Lunch 404

Time Magazine published a nice article by Bill Tancer (of Hitwise fame) talking about the top 10000 keywords searched for that contain the word 'free' like [free games] or [free myspace layouts]. The analysis is very interesting for anyone into keyword research but amusingly, he found that no one is searching for [free lunch].

It's not long, and well worth a read.

How to Promote Your Killer Content and Pick up Links Along the Way

Note: This is the first ever guest blog post on things of sorts. If you would like to guest blog here, please drop me a line with your idea or you can also email me. Since this is an experiment (I've asked a couple of others to guest blog too), I would love to hear your comments. If you have a favorite topic you would like to see a post on, you can also ask!

Today we have Rand from 14th Colony talking about content promotion. Apart from formatting the post and pasting it into WordPress, the whole thing is Rand's.


It seems every SEO blog and forum says the key to getting backlinks and rankings is to "write great content". But even the best, most insightful, unique content won’t do anything for your site if nobody reads it. What is “great content” and how do you actually get the links you want after the content is published?

Great content is unique, informative, and insightful information presented in a way your intended audience will be receptive to.

What great content is not is regurgitated information that is already on a thousand other websites. Even if you are presenting information that is common knowledge just by putting your own personal stamp on it you may have something noteworthy. And noteworthy content is what picks up links.

Tips to make "average" content noteworthy

  • Package the information to match your target market. An extreme example of this is websites that translate documents from one language to another. A less extreme example is repackaging a business article for working moms – the tone of the article will be completely different making the information more valuable.
  • Fill in the blanks. Professional articles in every industry assume the viewer knows all the background needed to understand their point. This just isn’t true. This post is an example of filling in a blank: how to promote articles to get links.
  • Be ahead of the curve. If you put together information before anyone else does, your article will be cited as the source document by all the others that follow. An example of that is my article on Social Bookmarking that came out when that phenomenon was still pretty new. Pierre included a link referencing it to explain his article on Social Bookmarking Code.
  • Make it personal. If you can show that you have been in the situation the viewer is facing and made it through, you develop a bond with them. Having clear examples and illustrations also helps.

And it probably goes without saying but I just have to push this point: proofread, edit, spell check, edit, read it out loud, edit, edit, edit. Spelling and grammar errors can drop the viewer’s trust in a heartbeat. Fix them ahead of time.

Getting the word out

Once you post your carefully crafted, super edited, insightful, unique article it is time to promote it.

  • Tell your friends and colleagues. Send out emails to people you know. Your personal relationships are easiest to connect with and they already have a vested interest in your success.
  • Ask industry leaders. Be very polite and do not attach any expectations. Industry leaders are who they are because they are busy. Often asking for criticism is the way to go as it appeals to their ego or sense of "giving back".
  • Announce the article in a forum. Be sure to comply with the rules and culture of the forum. You can also highlight the article in your signature file. I have had great success with this picking up links long after the initial "buzz" wore off.
  • Reference the article on your blog. If you don’t have a blog, get one. They are great vehicles for promotion. This will also get the word out through your RSS feed.
  • Reference your article in blog comments. These links don’t count for the search engines because of the nofollow issue but the traffic you pick up may lead to some strong links on other sites.
  • Leverage Social Media. Many social bookmarking sites use nofollow but some don’t and the exposure your article gets – even if it doesn’t hit the home page – is often enough to get you some links along the way. Don’t submit everything you write, just your best stuff, and have a friend submit the articles for you (Diggers don’t like self-promotion).
  • Write a press release. Getting your article mentioned by industry newsletters and websites can be a big boon.
  • Submit to directories that offer deep links. Most directories only link to the home page but some will link to any page in your website that you choose. You can do a search for [keyword +submit] or [keyword +"add url"] to find sites like these.
  • Exchange links. Reciprocal linking in limited doses is ok. Just don’t get carried away.
  • Offer the article for translation. If you know people in your industry that speak a different language offer to let them translate your article in exchange for a link back.
  • Write smaller articles on your topic for distribution sites. If your article is about "widgets" you can expand on some details and have a new article about "the history of widgets" or "the difference between red widgets and blue widgets" or… whatever. The point is you’ve already done the research so use it to generate some smaller articles that reference your main article.

One to two weeks of solid promotion should give your article the momentum it needs. If you do another round of link-building every 6 months you can have a solid position in the search engines and consistent stream of traffic for years.

Randall McCarley has more than 10 years experience in marketing and website development and promotion. He owns 14th Colony and Linker’s Union and is a moderator at SEO Refugee.

Yahoo! and Google have Strongest Brands

A press release has just been made public covering research by Penn State's College of Information Sciences and Technology. It's a very succinct write-up, so I'll just quote bits of it:

Researchers in the College of Information Sciences and Technology (IST) copied Google results pages from four different e-commerce queries, ascribing them to four different search engines -- Google, MSN Live Search, Yahoo! and an in-house engine created for the study. Then the researchers showed the pages to 32 study participants who were asked to evaluate the engines’ performance in returning relevant results.

Despite the results pages being identical in content and presentation, participants indicated that Yahoo! and Google outperformed MSN Live Search and the in-house search engine.

Participants ranked results from Yahoo! more relevant across the four queries.

The whole premise of the press release is that this observation is the result of brand power for both Google and Yahoo!. It's an interesting observation, and certainly makes sense, but I'm still not 100% convinced. The sample size is too small and, as the researchers noted, "many of the participants said they used Google to search". The very next thing they need to try is to recruit MSN/Live users to do the experiment. If their hypothesis is true, the MSN/Live users would rate MSN's results top.

Regardless, an interesting note that could explain a lot of the momentum behind the top SEs.

Joost Review

This is NOT a paid review. I say this because I've done paid reviews before here.

I managed to get a Joost invite from Last 100, giving me a chance to review what the fuss is all about.

What is Joost? To quote their website:

The new way of watching TV

All the things you love about TV, fused with the interactive power of the internet – just the way you want it. Enjoy the ride!

...which doesn't say much. Really, it's a P2P-based distribution network for TV channels. Think Kazaa, but instead of delivering MP3 files, it streams video. It's a very neat way to solve the technical problem of serving huge amounts of content.

Does it work? Well, sort of. The picture quality is on the low end of what you can achieve with TV. It obviously uses up a ton of bandwidth on your end, and so it's very sensitive to your connection. On a wireless broadband connection here in the UK, it worked acceptably well, but it was definitely choppy at times, and the sound quality suffered. Interestingly, the video would keep running but the sound would garble or jump. I'm guessing that perhaps the Joost software gives higher priority to the video stream (if video and audio are actually split). Also, when a new channel starts playing, the quality is terrible but quickly moves up to good.

Also, the software is very - let me say this again, VERY - demanding. On my dual-core 1.8GHz computer, the whole computer seemed to slow down. The mouse would jump, the screen would take a while to refresh, and all the other symptoms of high CPU usage showed up. Task Manager showed that Joost was using 60% of the CPU all the time.

Content-wise, it's a bit thin, at least for what I'm interested in. Randomly channel surfing, I came across some kind of UK user-generated record-breaking channel where ordinary folks try to break world records. I decided it was not good after watching the first segment, featuring five men on all fours racing like greyhounds, on a dog race track, chasing the same rabbit the dogs usually chase. Thank you, but YouTube is already filled with such crap. However, this seems to have triggered a bug because everything I watched afterwards was a repeating cycle of ads: a car, HP, then Nike, then Intel, then back to the car advert and repeat.

The interface is also a bit... mysterious. Lots of buttons and animations, but you really have to hover over everything to figure out what it does. Coupled with the general slow-down I mentioned earlier, I routinely double clicked everything early on because I thought my first click didn't register. The interface looks like it was designed in Flash. However, the controls are small enough not to distract too much from the actual content, a major plus.

By default, Joost wants to run in full-screen mode, and some features demand full-screen functionality. I don't know why it is so picky, but at least you can watch in a window and snap in and out of full screen occasionally. I wouldn't watch full screen given the low quality video.

So big picture: is it a Good Thing? Yes. For all its technological and user experience shortcomings, Joost will undoubtedly grow to be a major force in online video. If they decide to allow ANYONE to broadcast content (like YouTube), then they really could change the game. Can you imagine stringing a bunch of video blogs and creating a channel? THAT would be awesome - your own personalized channel. They would need an API to simplify subscriptions and easy manipulation of channels, but it's not much of a jump given what they've already done.

So: watch this space (if you'll excuse the pun).

Finally: if anyone wants Joost invites, drop a comment below and I'll send one to your email address. I have to do the invites manually, one at a time, so if a lot of people ask, please be patient :).

Online Survey Proves I’m Nerdy

I saw it on the Internet. It must be true.

So Michael tagged me with the latest meme going around: taking a nerd test to get a score (of course). And my score? Let me use the test's words: "I am nerdier than 92% of all people." That's right folks, I know my periodic table inside out and can recognize photos of great scientists who passed away hundreds of years ago. I even get a badge:

I am nerdier than 92% of all people. Are you a nerd? Click here to find out!

Apparently, this beats the highest score Michael knows about. If you'll excuse me, I need to go have a walk or something.

So now I get to choose some people to have a go at beating me. Let's hear it for:

  • Joe
  • Sophie (wonder if switching to a Mac has affected her)
  • Kim

Building a Website on a Budget

Sophie Wegat, the brains behind Think Prospect of Melbourne, Australia, just had an article of hers published in My Business, an Australian business magazine. The article is very easy to read and explains the basics of building a website on a budget in detail. Articles like these are great for beginners in SEO and web design, and also for the professionals who would like a prop to help them explain the process to their beginner clients.

Sophie just blogged about her article linking to a PDF download: Building a Website on a Budget. Well done Sophie!

Social Bookmarking How-to

In the past few weeks I saw the same question asked twice: how do I choose which social bookmarking sites to use on my blog or website? Here is my explanation of how different sites do it and my suggestion of 'best practice'.

There are three approaches I've seen on the web:

  • The 'Link Series' approach: a series of links, sometimes with images, dumped on the page. Honestly, I think this is a waste of space as the CTR of these links is minimal. This is what I had on eKstreme.com and what led me to create the Socializer in the first place. If you go down this route, keep an eye on what's going on. I suggest you hack the AJAX link tracker.
  • The 'Chosen Few' approach, which links to only a select few of the myriad social sites out there. If you're a shopping site, you can submit to social shopping sites. Science-type sites can use Connotea, which is a delicious-type service for academics. Techie sites go for digg and slashdot. You get the idea.
  • The Socializer approach, which you know well. Some people use it on its own, and that could be the wrong way to do it. There are two really good ways to use the Socializer:
    • If you use just the Socializer, use a very descriptive anchor text. For example, on my Science Blog, I use the anchor text "Social bookmark post (digg, delicious, reddit, etc)". I also use the Socializer logo as I've noticed having the image increases CTR.
    • Another excellent variation I've seen is that you link to very few sites (say, delicious and reddit) and then use the Socializer as "the rest" link. I like this approach, as it gives easy access to the social sites your visitors use most often, but covers all bases.

For a website that doesn't have an unusual layout (like a blog), I recommend the last variation. However, test, test, and then test some more!

The real question is how do you choose which services to list? For each site, test the various services. However, perhaps the strongest hint is your log files: which services refer hits to you the most? These are the services your visitors are already using. This data is even more interesting if your site does not already have social bookmarking links; in this case, the log files tell you the most popular services your visitors use without any bias from your help.
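To make the log file idea concrete, here is a minimal sketch in Python. It assumes your server writes the common "combined" access log format (with the referrer in the second-to-last quoted field), and the list of social hosts and the `count_social_referrers` function name are purely illustrative, so adjust both to your own setup.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical list of social bookmarking hosts to look for; edit to taste.
SOCIAL_HOSTS = {"digg.com", "del.icio.us", "reddit.com", "slashdot.org"}

# Assumes the combined log format, which ends with:
#   "request" status bytes "referrer" "user-agent"
LOG_TAIL = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"$')


def count_social_referrers(lines):
    """Count hits per social host, based on the referrer field of each line."""
    counts = Counter()
    for line in lines:
        match = LOG_TAIL.search(line.strip())
        if not match:
            continue  # skip lines that are not in the assumed format
        host = urlparse(match.group("referrer")).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        if host in SOCIAL_HOSTS:
            counts[host] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines; in practice, pass an open log file instead.
    sample = [
        '1.2.3.4 - - [10/Oct/2007:13:55:36 +0000] "GET /post HTTP/1.1" 200 2326 "http://www.digg.com/tech/story" "Mozilla/5.0"',
        '1.2.3.5 - - [10/Oct/2007:13:56:01 +0000] "GET /post HTTP/1.1" 200 2326 "http://reddit.com/r/seo" "Mozilla/5.0"',
        '1.2.3.6 - - [10/Oct/2007:13:57:12 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    ]
    for host, hits in count_social_referrers(sample).most_common():
        print(host, hits)
```

The hosts at the top of the resulting list are the ones worth linking to directly. Since an open file is just an iterable of lines, `count_social_referrers(open("access.log"))` would work the same way, where the file name is again only a placeholder.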

Finally, a comment about location. I think, although I haven't tested this extensively, that you should place your socializing links in a consistent manner throughout the site at the bottom of each page or blog post. I've also seen some very successful links placed at the top of the pages. Also, my experience is that using the services' logos helps the CTR a lot, in addition to descriptive text if you can squeeze some in.

And there you have it. Questions or comments below :)

First Look at Minimo

I can't remember how I got to finding this, but I found the Mozilla project aimed at Windows-based mobile devices. The project is called minimo. It is based on the same rendering engine that Firefox uses, and so I had to try it! Version 0.2 was recently released, so I downloaded and installed it on my trusty PDA phone, an Orange SPV M600. With the built-in WiFi, testing net browsing is very easy, and here are the results.

Installation and Home Page

Installation is very easy and uneventful - you download an installer, connect up your device, start the installer and follow the post-synch installation instructions on the handheld. This creates an icon in the Programs list. Click that, and the minimo splash screen starts up with a progress bar - yes it's quite a large program.

[Screenshots: the minimo splash screen; the minimo home page.]

The default home page of minimo is called homebase. It's quite a tidy and neatly arranged area. At the very top, there is a URL address entry bar with a Google search bar immediately underneath. Below is a list of mobile-friendly services, such as the del.icio.us mobile RSS feed, weather, and Google maps. As you browse, your history gets added to this page, a very cool feature, and certainly a time-saver.

Browsing and Tabs

This is the coolest feature of minimo: built-in tabbed browsing support. When I read about this, my gut reaction was that the screen is too small to have more than one tab. Well, let me tell you that this was a very wrong assumption. The tabs are somehow nicely squeezed in, and make mobile web browsing a true pleasure. I was able to actually browse for an hour or so without feeling cramped.

The other neat thing about minimo is how it manages to display websites in a way that minimizes horizontal scrolling. To take a few examples close to heart, I took screen shots of minimo displaying eKstreme.com's home page, the Socializer, and OSNews. Very nice and tidy, but it also depends a lot on the underlying HTML; for example, OSNews has a mobile-friendly version that's automatically served to handhelds. For stubborn sites, there is a panning button in the toolbar at the bottom that allows you to use the stylus to pan around the web page. Again, a great feature for small screens, and built right into the interface. Major kudos.

[Screenshots: eKstreme.com's home page, the Socializer, and OSNews.com displayed in minimo.]

Menus and Settings

The minimo interface is very menu driven. If you tap and hold on an empty bit of the page, you get a context menu. Tapping and holding a tab will show the tabs menu. Further, there is a "..." button on the toolbar that displays the global menu bar. Everything in it is straightforward, so I'll just explore the Preferences menu a bit more.

The Preferences are arranged in five categories symbolized by five colored buttons. The settings are easy to understand and change, and once done, just tap the save button. There is one feature worth mentioning though: the SSR. It took me a while to figure out what that means, but eventually, the minimo FAQ explained it best:

What is SSR and how do I use it?

This question has come up several times. SSR adjusts the look and feel of a page via CSS. SSR attempts to adjust image sizes, fonts, and layouts to maximize page space. SSR also attempts to eliminate side scrolling. To use SSR simply click the blue globe and choose SSR. This will attempt to adjust the layout of the page to better fit on your screen.

[Screenshots: the minimo settings; the minimo menu.]

Bugs

Having said all this, there are a few bugs still to be ironed out:

  • When I first ran minimo, it complained of low memory. Problem is, that was the last peep I heard: the whole device crashed. The only way to recover was to remove the battery. However, that was the only crash I saw from it, so perhaps it was a fluke.
  • As you can see from all the screenshots, the text entry bar that shows up by default in Windows Mobile is not visible. It's supposed to be at the bottom of the screen, but somehow, minimo pushes it off or hides it. This means that you cannot change the text entry method once minimo has started. Another bug: minimo doesn't work with the Transcriber entry method. Instead of getting your scribbles showing up on the screen and interpreted as text, minimo interprets all screen taps as interactions with the displayed web page. This means you'll be randomly clicking links or even displaying the context menu. This is the most serious bug I've found.
  • Quitting doesn't quit. More specifically, clicking the 'x' button at the top right doesn't close minimo. To quit, you need to click the '...' menu button in the toolbar and choose Quit. That works a treat.
  • Gmail has a mobile-friendly interface that works really well in mobile Internet Explorer. However, in minimo, I got an XML error message.
  • Finally, more of a feature request than a bug: I don't like Google, preferring to use Yahoo! most of the time. I really would like to have a Yahoo! search option built in, especially on Homebase (the home page). Yahoo has a mobile-friendly search, so it shouldn't be too hard to build it in.

Concluding Thoughts

I was very impressed with minimo. Actually, using minimo showed me that mobile net access might just work. Everyone has been predicting the mobile revolution but I never thought it would happen simply because the screens are too small and the interfaces cramped. Somehow, minimo sidesteps all these and it made a convert out of me.

Granted, there are still a few bugs to fix, some of them serious, but remember this is still a development version. Frankly, it's perfectly fine to use on a daily basis as it stands, but I have great hopes for its future releases. The developers should be commended for their hard work!
