smart splogs

January 15th, 2006

http://thisasseenontvpetsteps.blogspot.com/
or
http://thisdoggieramp.blogspot.com/

not your usual link collection.

google spams

January 15th, 2006

ok, provocative title. Let’s rephrase: google tolerates spam.

Blogger is owned by google. It runs the biggest blog service on its blogspot domain.

It appears to be very simple to create hundreds of thousands of ‘weblogs’ like this:

http://p85.blogspot.com/

Created solely for spam purposes. So-called ‘splogs’. You set up a robot and there is nothing in the blogger software that stops you from adding all the blogs you like.

This is not new. Google / Blogger / Blogspot knows about it. They have done nothing about it in recent years.

It should be relatively easy to make sure that there is a human in front of the computer when a new weblog is created at blogspot.com. Simple captchas are very common today.

There are two possible explanations why this has not happened yet:

- blogspot engineering is amazingly incapable

or

- there is no real rush to get rid of splogs on google’s side.

It might make sense:
You have to forget the “don’t be evil” and “organize the world’s information and make it easily accessible” google dogmas for a second though. Google knows one thing very, very well: how to run a scalable service. They have the lowest cost per stored bit due to their own file system technology. It uses commodity hardware and handles failover brilliantly. It does not cost google much to host millions of splogs.

But wouldn’t millions of false blogs pose a danger to the result quality of a search engine?

Exactly.

Google knows from which IP address a blog gets maintained. Nobody else does. They have the actual blog data readily available for further parsing. I doubt that the googlebot comes through the front door to blogspot. The bandwidth alone that could be saved by crawling blogspot internally should make up for the ‘exception’ that this would mean to the googlebot operations. I don’t know these things. It’s a guess.

Every search engine has to have spam combat tools these days. Google is one of the most useful search engines and in the US they have an ok handle on search engine spam. Isn’t it funny that they don’t use their insider knowledge and access together with their anti-spam tools to simply turn off splogs on blogspot?

Last October there was somebody who scraped famous bloggers’ sites and reposted that content on splogs. That got some attention, and stopped. But splogs did not.

Blogspot hosts lots of splogs. But also lots of legit and very powerful weblogs. Nobody can really afford to ignore the biggest weblog service. Yahoo, Msn and even my little BlogsNow have to crawl blogspot in order to find out what is going on. Google can skip the skip, all others have to deal with it.

There is also a third theory that is the most plausible:

Splogs don’t matter to search engines. They have to crawl billions of pages anyway. Who cares about a couple of million spam blogs here and there? That’s probably what it is: the aircraft carrier keeps on going regardless of whether there are 50% more roaches in the kitchen or not.

those simple passwords

December 15th, 2005


elvis elvis
elvis elvis321
elvis elvis123
elvis 1
elvis 12
elvis 123
elvis 1234
elvis 12345
elvis 123456
elvis password
elvis passwd
elvis test
elvis test123
elvis sivle

Unix is secure, but only as secure as your passwords. I just came across this lame rootkit on some computer.
Above are the passwords that it seems to try for every user; apparently they are common enough. Pretty lame, but it seems to work.
If you think you are smart and have a password like ‘usermane’ then think again.
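
A minimal sketch of what that list boils down to, assuming the rootkit simply swaps in each username where ‘elvis’ appears above (that is my reading of the list, not something verified against the rootkit itself). The function names are made up for illustration.

```python
def candidate_passwords(username):
    """Rebuild the guess list shown above for an arbitrary username.

    Assumption: the pattern seen for 'elvis' is applied to every user.
    """
    return [
        username,              # elvis
        username + "321",      # elvis321
        username + "123",      # elvis123
        "1",
        "12",
        "123",
        "1234",
        "12345",
        "123456",
        "password",
        "passwd",
        "test",
        "test123",
        username[::-1],        # sivle: the username spelled backwards
    ]


def is_guessable(username, password):
    """True if the password is one of the guesses from the list."""
    return password in candidate_passwords(username)
```

Running is_guessable over your own accounts is a cheap sanity check, nothing more. A password that passes it can still be terrible.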

os x server: 10.4 and I still hate it

December 5th, 2005

File servers. No big deal. I have been dealing with this kind of thing for more than ten years. And it works.
We tried OS X Server 10.2.8 a while back, and it was bad. Now I have to deal with it again: OS X 10.4.3
and it still is junk. It is broken. Things don’t work as they should. Apple’s way of doing things is
incompatible with everything. It is such a waste of time. If they add a GUI then they should leave
the way things are done underneath as everybody would expect them. XServers get bought by people
for their shiny facade. Which is all ok for me. Just that the inner workings of it are simply rotten.
The non-server version of OS X is much more consistent, with all features in the Sharing pane of the
System Preferences.

It’s all gory details. I don’t even want to go into it. It was broken with 10.2.8. With 10.4.3 it is still broken.
Fileservers are not THAT important that you want to waste your entire work life administering them.
And with Linux (or even SGI for that matter) you don’t have to. You learn the meaning of a few commands and
are done with it.

It really is bullshit. If you are considering getting a server, don’t get an Apple one. They are too expensive and do not work in a way that makes any sense …

sploups?

November 3rd, 2005

Once in a while I watch how the BlogsNow bot crawls. URLs running by. Today a yahoo Groups URL caught my eye. I didn’t know that yahoo uses ping services for these pages. I don’t think they do.
I asked BlogsNow for all yahoo Groups and got a list of 42,278 different ones. I clicked randomly on 5, and looked over the names of hundreds. I could not find a single legit one. The spam content that gets pushed via yahoo groups is the same as on spam blogs.
No need for examples.

blogdex where art thou?

October 31st, 2005

Blogdex is still down. So I thought I might run some google adwords pointing people to BlogsNow.
Turned out somebody was faster: Right now I see an ad for blogturbo dot com. Interesting what google advertises:
It costs only 149 US$ and you can generate thousands of weblogs pointing to your site. This looks like a keyword spam tool to me.
Interesting that google runs ads for it.

Then I wondered what is going on at daypop.com.
Turns out they are down as well …

update November 1st
Blogdex: “up” again, yet results are old/pointless right now.
Daypop: back up again, results make sense. the usual 24 hour delay
blogturbo: still showing ads on google adwords for blogdex.

zombies

October 20th, 2005

and where they are coming from

ping poison

October 16th, 2005

BlogsNow gets seven pings a second. I just had a cursory look over those. Yes, they are all spam.
If you still ping BlogsNow with good intentions, please stop doing so. If you ping BlogsNow in the future then your weblog will go on the black list. Sorry.

‘em crawling

September 29th, 2005

//wordpress/xmlrpc.php

Put a machine on the internet, and it does not take long until you get a request like the one above. There is a hole in Wordpress, and apparently there are robots out there trying to find vulnerable installations.

Update or move it to a name that is less predictable.
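
If you want to see how often the robots come knocking, something like the following would do. It is only a sketch: it assumes an access log in the usual Apache-style common log format (client address first, request in double quotes), and the function name and sample addresses are made up for illustration.

```python
import re

# Matches the quoted request part of a log line when the path ends in
# xmlrpc.php, with or without a query string behind it.
PROBE = re.compile(r'"(?:GET|POST|HEAD) (\S*xmlrpc\.php)\S* HTTP')


def find_xmlrpc_probes(log_lines):
    """Return (client address, requested path) for every xmlrpc.php request."""
    hits = []
    for line in log_lines:
        match = PROBE.search(line)
        if match:
            client = line.split(" ", 1)[0]  # first field is the client address
            hits.append((client, match.group(1)))
    return hits


sample = [
    '192.0.2.7 - - [29/Sep/2005:10:00:00 +0000] "POST //wordpress/xmlrpc.php HTTP/1.0" 404 210',
    '192.0.2.8 - - [29/Sep/2005:10:00:05 +0000] "GET /index.html HTTP/1.0" 200 512',
]
for client, path in find_xmlrpc_probes(sample):
    print(client, path)
```

This only counts the probes. It does nothing to stop them, so the advice above still stands.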

Andreas

god, they would not be that stupid

September 20th, 2005

of course they are

two hundred spammers?

June 20th, 2005

in this text somebody claims that 200 people are responsible for 90% of all spam in the world.

Could we find them a job? Please.

it’s the law

June 18th, 2005

forty million credit card numbers walked out of a building somewhere.

The CC companies said there was so far no more fraud activity “beyond the ordinary”.

Lately there has been a flurry of reports. As the linked article explains, there is a new California state law that requires businesses to notify their customers when their personal information has been exposed in a security breach. That explains a lot.

oops

June 7th, 2005

Citigroup loses tapes with the records
of 3.7 million clients. What _would_ have been cool is if they had not had copies.
That would teach them.
Right now all those 3.7 million people can hope for is that UPS lost the tapes real good. They can lose things so well that nobody can find them.
Really, only fedex and UPS can do that.

intel mac

June 7th, 2005

Finally the stack overflow exploits handcrafted for intel CPUs might start working on Macs!

not the last time we will hear this

June 3rd, 2005

malware alliance
What would happen if the masses of recruited Windows PCs were able to impact a bigger part
of the internet, so that outages would be noticeable for more people?

Think “Dr. Evil”:

You want the internet back? That would be “one million dollars”.

Thanks Microsoft. I hope Bill is paying the ransom he and his OS have caused.

It’s not the internet that is vulnerable, it is not the computers. It is the operating system called Windows made by Microsoft.
Technically all systems can have viruses. In reality only Microsoft Windows systems are part of these malware empires.

Since people tend to say otherwise: This has nothing to do with market share. 10% Apple systems is by far enough to be
attractive. In the webserver market Microsoft products are the minority but still manage to host all the interesting exploits.

It’s a design problem, and a historical one. For Windows security the genie is out of the bottle. Apple can afford to fix every problem that becomes known: Their virus count is zero. It is so much easier to go back to zero from one than from several thousand.

BlogsNow Version 2 and spam

May 28th, 2005

BlogsNow Version 2 is coming along. Instead of moving code and data from Version 1 over to this host I decided to write it again. Most changes go into spam detection and filtering.

Right now BlogsNow Version 2 flags and ignores -

- spam:
http://midwesternerslavished.blogspot.com/
http://pet-insurance-tips.blogspot.com/
http://guitar-rock.blogspot.com/

- indecent content:
http://spaces.msn.com/members/teen-galleries/
http://spaces.msn.com/members/adult-creampies/

[I thought that spaces had such a tight content filter, apparently not]

- ‘blogs’ that forward directly to porn sites:
http://jasmine-disney-hentai.blogspot.com

There is an ever-increasing number of blogs that were created only for spam purposes.
Right now it looks as if BlogsNow can start crawling blogspot.com blogs again in Version 2.

google finds adsense

May 25th, 2005

searching google for adsense

Right now this search returns a domain as the first result: www.all-in-one-business.com/adsense/
The real adsense page comes in only second.

Google directs a vast amount of internet traffic. Internet traffic can be made into money. One way or another. If you get it cheap enough there will be a profit. People only click on the first results they find. There is a wide rainbow of SEO (“Search Engine Optimization”) activities. From nice to criminal,
and everything in between.

One mean trick is to hijack a page. Basically steal it. Google never really acknowledged the problem. Nor did they address it.
That’s why they are the victim of it themselves.

more details
first blog to report this flaw

OneCare

May 14th, 2005

From a Microsoft press release:

The dynamic nature of the Internet and technology can make the protection, maintenance and optimal performance of PCs a challenge for consumers. Keeping a PC "healthy" today can be daunting and time-consuming for the average user.

They forgot to add “if you run Windows”.

OS X has no need for ‘OneCare’, since the problem does not exist.

embracing spam

May 11th, 2005

Spammers use blogs. Massively. Spam is always massive.

BlogsNow has to deal with this. Sometimes this sucks, and sometimes it’s actually an interesting challenge: How quickly can the results be cleaned up? Today it was easy:

results before filtering

results after filtering