four million blogs

October 9th, 2005

BlogsNow had a little hiccup. An old index issue finally prevented one of its main tables from accepting new entries. The easiest remedy was to delete the last four million weblogs that had been added.
They will come back, I am sure.

no more ads

September 22nd, 2005

Turned off the ads on BlogsNow. Don’t feel like it right now.

kassandra

September 5th, 2005

This is what I was clicking on at 4:46pm PDT on Sunday. Found it on BlogsNow. Which simply means that lots of bloggers had linked to this. Which means that the information was not only there, it had also been read, understood and repeated. Another link from that day was this National Geographic article from last October.

During last year’s tsunami there was a lot of talk about the fact that the early warning systems in the Pacific knew about the event before people died.
This seems to be a recurring theme. To see people die because they lack something as cheap and ubiquitous as information is beyond understanding. We only think that information is free and generally available. It is an illusion. How many bloggers were in the Superdome?

Another angle to the same aspect of this very sad story:

Growing up in a frontline country of the Cold War, we were aware of what was needed in which situation. One of the things we always had was a battery-powered radio. Even as a ten-year-old kid I knew that this would be a crucial source of information in case something went wrong.
If only the people of New Orleans had listened to battery-powered radios. Could have. Would have. We hear that a lot during these sad days.


URGENT - WEATHER MESSAGE
NATIONAL WEATHER SERVICE NEW ORLEANS LA
1011 AM CDT SUN AUG 28 2005

DEVASTATING DAMAGE EXPECTED

HURRICANE KATRINA
A MOST POWERFUL HURRICANE WITH UNPRECEDENTED
STRENGTH...RIVALING THE INTENSITY OF HURRICANE CAMILLE OF 1969.

MOST OF THE AREA WILL BE UNINHABITABLE FOR WEEKS...PERHAPS LONGER. AT
LEAST ONE HALF OF WELL CONSTRUCTED HOMES WILL HAVE ROOF AND WALL
FAILURE. ALL GABLED ROOFS WILL FAIL...LEAVING THOSE HOMES SEVERELY
DAMAGED OR DESTROYED.

THE MAJORITY OF INDUSTRIAL BUILDINGS WILL BECOME NON FUNCTIONAL.
PARTIAL TO COMPLETE WALL AND ROOF FAILURE IS EXPECTED. ALL WOOD
FRAMED LOW RISING APARTMENT BUILDINGS WILL BE DESTROYED. CONCRETE
BLOCK LOW RISE APARTMENTS WILL SUSTAIN MAJOR DAMAGE...INCLUDING SOME
WALL AND ROOF FAILURE.

HIGH RISE OFFICE AND APARTMENT BUILDINGS WILL SWAY DANGEROUSLY...A
FEW TO THE POINT OF TOTAL COLLAPSE. ALL WINDOWS WILL BLOW OUT.

AIRBORNE DEBRIS WILL BE WIDESPREAD...AND MAY INCLUDE HEAVY ITEMS SUCH
AS HOUSEHOLD APPLIANCES AND EVEN LIGHT VEHICLES. SPORT UTILITY
VEHICLES AND LIGHT TRUCKS WILL BE MOVED. THE BLOWN DEBRIS WILL CREATE
ADDITIONAL DESTRUCTION. PERSONS...PETS...AND LIVESTOCK EXPOSED TO THE
WINDS WILL FACE CERTAIN DEATH IF STRUCK.

POWER OUTAGES WILL LAST FOR WEEKS...AS MOST POWER POLES WILL BE DOWN
AND TRANSFORMERS DESTROYED. WATER SHORTAGES WILL MAKE HUMAN SUFFERING
INCREDIBLE BY MODERN STANDARDS.

THE VAST MAJORITY OF NATIVE TREES WILL BE SNAPPED OR UPROOTED. ONLY
THE HEARTIEST WILL REMAIN STANDING...BUT BE TOTALLY DEFOLIATED. FEW
CROPS WILL REMAIN. LIVESTOCK LEFT EXPOSED TO THE WINDS WILL BE
KILLED.

AN INLAND HURRICANE WIND WARNING IS ISSUED WHEN SUSTAINED WINDS NEAR
HURRICANE FORCE...OR FREQUENT GUSTS AT OR ABOVE HURRICANE FORCE...ARE
CERTAIN WITHIN THE NEXT 12 TO 24 HOURS.

ONCE TROPICAL STORM AND HURRICANE FORCE WINDS ONSET...DO NOT VENTURE
OUTSIDE!

new look

July 9th, 2005

BlogsNow has changed significantly. Now that I have more views active based on Version 2, I changed the design a little bit as well.
The update frequency has been changed to sixty seconds. Why not?
The new database design is pleasantly fast.
I also dropped the reference to “Version 2”. It certainly looks different enough for people to realize that this is indeed new.

a sad occasion

July 7th, 2005

This is a sad day. The terror in London is on everybody’s mind.

BlogsNow was written to be a fast reflector of what is going on in the world. Right now 64 out of the 100 links are related to the events in London. The top 17 links are all about it.

Here is how other tools look right now:

technorati
people search a lot for it, but the link list still focuses on yesterday’s Olympic nomination

blogdex
as usual blogdex has no clue, and it will be like this for a while

daypop
same here

BlogsNow
I wish I could have done this comparison with a more positive event.

3 hours later:
Server crashes. Again. Now I know that it is mysqlhotcopy, when making a backup. While running the mysql repair I ran out of disk space. All those bin-log files. Then I was stupid again and ctrl-c’d the repair. It would have waited for disk space. Then I tried to move the mysql data dir to another disk. Which takes a while. Then mysql did not want to start from that disk. Since I had no time for a dive into the manual, I moved things back and started it again. With the result that BlogsNow now shows results from 24 hours ago. Maybe I should learn something here?

Right now I like blogdex’s version of the latest news much better. It is yesterday’s. Yesterday was better than today.

the new news media

July 7th, 2005

as BlogsNow fills with London links

gizillion blogs

July 7th, 2005

BlogsNow Version 2 started with a clean and new database. During its one year of operation Version 1 had seen close to 7 million weblogs. BlogsNow follows ping lists like most other tools. These lists became more and more a resource for spammers to inject their content. BlogsNow Version 2 jumped from three to almost four million weblogs within one week. It turned out that two IP addresses alone had created 600,000 new ‘blogs’. All of them made just to spam whomever they can.
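
Spotting sources like those two IP addresses comes down to counting distinct new blogs per origin. The following is only a minimal sketch of that idea, not the actual BlogsNow code: the `(ip, blog_url)` record layout, the function name `suspicious_sources` and the threshold are all assumptions made for illustration.

```python
from collections import Counter

def suspicious_sources(pings, threshold):
    """Return (ip, distinct_blog_count) pairs for IPs at or above threshold.

    `pings` is a hypothetical list of (ip, blog_url) tuples, one per ping
    received. Repeated pings for the same blog from the same IP count once.
    """
    per_ip = Counter(ip for ip, _blog in set(pings))
    return [(ip, n) for ip, n in per_ip.most_common() if n >= threshold]

# Made-up sample data using documentation-only IP ranges.
pings = [
    ("192.0.2.1", "http://spam1.example/"),
    ("192.0.2.1", "http://spam2.example/"),
    ("192.0.2.1", "http://spam3.example/"),
    ("192.0.2.1", "http://spam3.example/"),  # duplicate ping, counted once
    ("198.51.100.7", "http://real.example/"),
]
flagged = suspicious_sources(pings, threshold=3)
```

A real threshold would have to sit well above what a shared hosting service or proxy legitimately produces, so treat the number as something to tune rather than a rule.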

Many websites that track weblogs advertise how many weblogs they track. It appears that the 11 million you find quoted right now are actually an accumulation of all weblogs ever seen, regardless of whether they are active or not. And a certain share of bogus blogs, created only for spam, should be taken into account as well.

Those inflated numbers are used wherever people like to put an extra boost on the blog phenomenon. There are definitely millions of blogs.
But the active blogging community may be just a few hundred thousand people.

BlogsNow view on the Zeitgeist

July 4th, 2005

Just added six more views to BlogsNow. Most of the Version 1 views are back now.
And then some.
See for yourself.

glitch #1

July 1st, 2005

4:20 this morning. The computer that runs BlogsNow freezes.
Not the first time, but it had been doing OK for the last two months.
Tricky problem to fix: the error frequency is so low that you would only know after a year or two whether a change fixed the issue or you were just lucky. Sigh.

Good thing: mysql and its wonderful mysqlcheck auto repair.

Does in minutes what took days on the old server.

Around the world in 100 Links

June 28th, 2005

BlogsNow got its first ‘special view’ today. I mentioned it before, but this is the first sign of it: I rewrote BlogsNow in Version 2 partly to be able to whip up quick views that I feel might be interesting. Google Maps is an amazing web application. And, of course, with all those images out there, people will find interesting views and share them in their blogs. BlogsNow simply lists the most prominent ones. As usual, millions of bloggers act as a collective filter that turns up something interesting. Since longitude and latitude coordinates do not mean much to most of us, I listed the closest airports instead.
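
Turning a coordinate pair into its closest airport is a nearest-neighbour lookup by great-circle distance. Here is a minimal sketch of one way to do it, assuming a small hand-made airport table; the `AIRPORTS` dict, its three entries with approximate coordinates, and both function names are illustrative and say nothing about how BlogsNow does it.

```python
import math

# Hypothetical lookup table: airport code -> (latitude, longitude) in degrees.
AIRPORTS = {
    "SFO": (37.6213, -122.3790),
    "JFK": (40.6413, -73.7781),
    "LHR": (51.4700, -0.4543),
}

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points on Earth."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def closest_airport(lat, lon, airports=AIRPORTS):
    """Return the code of the airport nearest to (lat, lon)."""
    return min(airports, key=lambda code: haversine_km(lat, lon, *airports[code]))

# A map link centred near the Golden Gate Bridge.
label = closest_airport(37.8199, -122.4783)
```

A linear scan like this is fine for a few thousand airports. A spatial index would only start to matter with far more points or far more lookups.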

BlogsNow gets ads

June 22nd, 2005

BlogsNow now runs ads.
I think one on the top is not too much.

Let’s see how this works out.

BlogsNow and the others

June 22nd, 2005

I made a little overview of meme tracking tools.

It looks as if BlogsNow is one of the internet’s best-kept secrets.

the king is dead, long live the king

June 21st, 2005

blogsnow Version 1 has been turned off. 16 months of solid service. Many hours of coding. People loved it, but now it’s offline.

Version 2 is the new black. The DNS will be switched right after I save this entry. And the rest is history.
Yes, there is stuff missing in Version 2, but on the other hand it does update quicker than you can read through it.
Try it.

BlogsNow Version2 one step closer to being done …

June 21st, 2005

BlogsNow Version 2 is getting there.

For now I cut some corners and just put the super fresh data online in the old design.
There are still no additional features such as different views, but I have a plan for that.

It all should go pretty quickly now. The plot thickens …

BlogsNow Version2: next feature

June 15th, 2005

BlogsNow Version2 just got the next feature: under ‘blgs’ there is a list of all the blogs linking to a given item.

BlogsNow Version 2: Preview Version live

June 12th, 2005

The first public page of BlogsNow Version 2

The link above will go to a preview page of BlogsNow Version2. It is just the latest links. BlogsNow Version2 became a complete rewrite. None of the old code or data has been used. Just the experience.

BlogSpam is one of the biggest issues for a Meme Tracker like BlogsNow. Right now it looks as if 25% of all active blogs are spam. Created by programs, not people. Created to make a quick buck for somebody somewhere.

It was an interesting mental exercise to spend so much coding time on this subject. My spontaneous reaction to ‘spam’ is that I really hate it. I hate the concept of creating huge damage for many people just so that a very few have a little financial gain. But being furious is not a good mental state to write code in. At least not for me. So I had to get over it, and just turn it off. It looks as if it works right now.

Since spam filtering works, I could include blogspot.com hosted blogs in BlogsNow Version 2 again. Which is nice, since there are just so many blogs on there. It was a sad day when I was forced to turn the crawl off for it in Version 1.

blogsnow V1 was fast. BlogsNow Version2 is even faster. It runs circles around Version1: an analysis of the last 50,000 links added to the blogosphere (covering something like 5-10 hours right now) takes about 30 seconds. Version1 is busy for about five minutes on the same task.

The Preview page gets a fresh data set every three minutes. Since it can ;-) The Ranking is a mix of number of links and time since the link has been added: Links added right now have full weight, while the last one has no weight. I think I will be playing with the exact recipe for a little bit.
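
To make the weighting concrete, here is a minimal sketch of one possible recipe. It assumes a straight linear decay between the oldest and the newest link in the analysed window, which is a guess (the exact recipe is still being played with), and the names `rank_links` and `events` are made up for the example.

```python
from collections import defaultdict

def rank_links(events):
    """Rank URLs by the sum of time-decayed weights of the links to them.

    `events` is a hypothetical list of (url, timestamp_seconds) pairs, one
    per blog link found in the analysed window. The newest link gets weight
    1.0 and the oldest gets 0.0, with a straight line in between (assumed).
    """
    if not events:
        return []
    times = [t for _, t in events]
    oldest, newest = min(times), max(times)
    span = newest - oldest
    scores = defaultdict(float)
    for url, t in events:
        weight = 1.0 if span == 0 else (t - oldest) / span
        scores[url] += weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Made-up sample: 'a' has two old links, 'b' two fresh ones, 'c' one in between.
events = [
    ("http://example.com/a", 0),
    ("http://example.com/a", 100),
    ("http://example.com/b", 900),
    ("http://example.com/b", 1000),
    ("http://example.com/c", 500),
]
ranking = rank_links(events)
```

With this sample, two fresh links beat one middle-aged link, which in turn beats two stale ones. A different decay curve, exponential for instance, would change the ordering for borderline cases, so the straight line is the part most worth experimenting with.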

The Preview page has no features. Of course there will be pages with who links to what, etc. I am somewhat undecided on RSS. Version1 had no RSS for half its life. People got all excited when I added it, but I did not see the use spread or become particularly interesting. I think that many people just ‘collect’ RSS feeds like they do bookmarks. But they never actually go back, since they are busy chasing the next butterfly. So I might as well skip RSS. Except for movies and mp3. Media Enclosures make sense. And yes, BlogsNow will have those lists as well.

Let me know what you think about this little glimpse of the future of BlogsNow Version2.

server crash

May 29th, 2005

It all went too well.

Suddenly the server was not reacting. Everything worked till the shell needed to do /anything/ with the disk.
Had to reset it. Sigh.

The syslog said:

May 29 19:14:25 andreaswacker kernel: kswapd0: page allocation failure. order:5, mode:0x50
May 29 19:14:25 andreaswacker kernel: [<0214b29a>] __alloc_pages+0x28b/0x298
May 29 19:14:25 andreaswacker kernel: [<0214b2bf>] __get_free_pages+0x18/0x24
May 29 19:14:25 andreaswacker kernel: [<0214e9c1>] kmem_getpages+0x15/0x94
May 29 19:14:25 andreaswacker kernel: [<0214f74c>] cache_grow+0x155/0x29a
May 29 19:14:25 andreaswacker kernel: [<0214fa9e>] cache_alloc_refill+0x20d/0x23d
May 29 19:14:25 andreaswacker kernel: [<0215004f>] __kmalloc+0x6b/0x7d
May 29 19:14:25 andreaswacker kernel: [<82964f54>] kmem_alloc+0x50/0x96 [xfs]
May 29 19:14:25 andreaswacker kernel: [<829479a1>] xfs_inode_item_format+0xe0/0x239 [xfs]
May 29 19:14:25 andreaswacker kernel: [<8295aaf3>] xfs_trans_fill_vecs+0x3a/0x86 [xfs]
May 29 19:14:25 andreaswacker kernel: [<8295a8a4>] xfs_trans_commit+0x18d/0x300 [xfs]
May 29 19:14:25 andreaswacker kernel: [<829494de>] xfs_iomap_write_allocate+0x248/0x436 [xfs]
May 29 19:14:25 andreaswacker kernel: [<82949520>] xfs_iomap_write_allocate+0x28a/0x436 [xfs]
May 29 19:14:25 andreaswacker kernel: [<0224e474>] generic_make_request+0x190/0x1a0
May 29 19:14:25 andreaswacker kernel: [<829485a3>] xfs_iomap+0x23b/0x3ed [xfs]
May 29 19:14:25 andreaswacker kernel: [<829486ba>] xfs_iomap+0x352/0x3ed [xfs]
May 29 19:14:25 andreaswacker kernel: [<8296c99f>] xfs_bmap+0x1a/0x1e [xfs]
May 29 19:14:25 andreaswacker kernel: [<82965219>] xfs_map_blocks+0x29/0x11e [xfs]
May 29 19:14:25 andreaswacker kernel: [<82965dd9>] xfs_page_state_convert+0x273/0x4e8 [xfs]
May 29 19:14:25 andreaswacker kernel: [<829664cb>] linvfs_writepage+0x91/0xc6 [xfs]
May 29 19:14:25 andreaswacker kernel: [<021522db>] pageout+0x83/0xc0
May 29 19:14:25 andreaswacker kernel: [<02152522>] shrink_list+0x20a/0x547
May 29 19:14:25 andreaswacker kernel: [<02151540>] __pagevec_release+0x15/0x1d
May 29 19:14:25 andreaswacker kernel: [<02152a92>] shrink_cache+0x233/0x4d5
May 29 19:14:25 andreaswacker kernel: [<0215357f>] shrink_zone+0x8f/0x9a
May 29 19:14:25 andreaswacker kernel: [<021538a5>] balance_pgdat+0x176/0x249
May 29 19:14:25 andreaswacker kernel: [<02153a3e>] kswapd+0xc6/0xc8
May 29 19:14:25 andreaswacker kernel: [<02120fcf>] autoremove_wake_function+0x0/0x2d
May 29 19:14:25 andreaswacker kernel: [<02120fcf>] autoremove_wake_function+0x0/0x2d
May 29 19:14:25 andreaswacker kernel: [<02153978>] kswapd+0x0/0xc8
May 29 19:14:25 andreaswacker kernel: [<021041d9>] kernel_thread_helper+0x5/0xb
May 29 19:14:25 andreaswacker kernel: deadlock in kmem_alloc (mode:0x50)
May 29 19:14:25 andreaswacker kernel: possible deadlock in kmem_alloc (mode:0x50)
May 29 19:14:25 andreaswacker last message repeated 85 times

I hope this is not common for Fedora Core 3 on an AMD machine with a big array. I will start to load the machine with tasks now.
Let’s see if it happens again.
Sure enough, MySQL was not happy, since I run it with delay-key-write.

Wordpress said:


WordPress database error: [Incorrect key file for table 'wp_comments'; try to repair it]

So I did a


mysqlcheck -pXXXXXX --auto-repair wordpress

which seems to have done the trick.

BlogsNow Version 2 and spam

May 28th, 2005

BlogsNow Version2 is coming along. Instead of moving code and data from Version1 over to this host I decided to write it again. Most changes go into spam detection and filtering.

Right now BlogsNow Version 2 flags and ignores:

- spam:
http://midwesternerslavished.blogspot.com/
http://pet-insurance-tips.blogspot.com/
http://guitar-rock.blogspot.com/

- indecent content:
http://spaces.msn.com/members/teen-galleries/
http://spaces.msn.com/members/adult-creampies/

[I thought that spaces had such a tight content filter, apparently not]

- ‘blogs’ that forward directly to porn sites:
http://jasmine-disney-hentai.blogspot.com

There is an ever-increasing number of blogs that were created only for spam purposes.
Right now it looks as if BlogsNow can start crawling blogspot.com blogs again in Version2.

embracing spam

May 11th, 2005

Spammers use blogs. Massively. Spam is always massive.

BlogsNow has to deal with this. Sometimes this sucks, and sometimes it’s actually an interesting challenge: how quickly can the results be cleaned up? Today it was easy:

results before filtering

results after filtering