So you have somehow begged, borrowed or stolen an email list of 1000 users who you believe are interested in your new service. Would it not be great if you could somehow convert that list into real people, with real photos, and perhaps even more concrete information like “My service has a higher than average gay consumer group” or “My dating service seems to be very popular among 9 year old girls”? Such information can help you correct course before you are too invested in a particular idea you have.
Well, a few weeks back, we were handed down this lovely present by our masters from above: Facebook. Save your email list as a CSV file (just comma-separate those email addresses). Upload this file to your Facebook account as if you wanted to add them as friends. Voila, Facebook will give you the profiles of all those users who have one (in my test, about 80% of the addresses on my email lists had Facebook profiles). Now, click through each profile, and because of the new default Facebook settings, which make all information public, about 95% of the user info is available for you to harvest.
If your email list is too large, then take the very same CSV file and upload it to Mechanical Turk (a list of 10,000 would cost you about $10), and ask the Mechanical Turk workers to gather this information for you.
After you have all the demographic information you want, try to do good with it. My personal advice to Facebook users: switch on your privacy settings and make your friend list private. Businesses want this information, and Facebook has given it to them.
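As a rough back-of-the-envelope sketch of those numbers: the 80% match rate and the $10 per 10,000 addresses are just the anecdotal figures from this post, not guarantees, and `estimate` is a made-up helper name.

```python
# Anecdotal figures from this post; treat them as assumptions.
MATCH_RATE = 0.80               # share of addresses that had a profile in my test
COST_PER_ADDRESS = 10 / 10_000  # about $10 for a list of 10,000

def estimate(list_size):
    """Return (expected matched profiles, rough outsourcing cost in dollars)."""
    matched = round(list_size * MATCH_RATE)
    cost = list_size * COST_PER_ADDRESS
    return matched, cost

matched, cost = estimate(1000)
print(f"{matched} profiles expected, about ${cost:.2f} to outsource")
```

Your own match rate and price will probably differ, so swap in whatever you actually observe.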
Update (from a reddit comment): Use this URL http://www.facebook.com/search/?ref=ffs&q=name@domain.com&o=2048&… and screenscrape for even more spammy goodness.
If you're interested in technology & startups, then follow me on my low volume twitter account





Wow. It’s a little scary how easy it has become to harvest info about complete strangers.
Hmm, I thought Facebook disallowed the ability to make our friendlist private last month…
You can still make your friendlist private, it’s just a bit more obscure now.
Hmm, as a Statistics student I have to say this is a flawed method of gathering information. It suffers from sampling bias, as the younger demographic is more likely to have a Facebook account than the elderly. Thus, this method likely underestimates age (and probably skews gender, ethnicity, and orientation too).
Yeah, but if it represents 80% of the total users, then I'd say it's still pretty darned representative, even if the other 20% was *completely* opposed data.
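For what it's worth, both comments above can be checked with a little arithmetic. This is a minimal sketch, assuming a made-up helper `true_share_bounds` and an illustrative observed share of 30%: if a trait shows up in some share of the 80% of users you could match, the share among all users is only pinned down to a range, because the unmatched 20% could hold anything.

```python
def true_share_bounds(observed_share, coverage):
    """Bounds on a trait's share among all users, given its share among
    the matched users and the fraction of users that were matched."""
    matched_part = observed_share * coverage
    low = matched_part                    # no unmatched user has the trait
    high = matched_part + (1 - coverage)  # every unmatched user has the trait
    return low, high

low, high = true_share_bounds(observed_share=0.30, coverage=0.80)
print(f"true share is between {low:.0%} and {high:.0%}")
```

The width of that range is always one minus the coverage, so 80% coverage leaves 20 percentage points of uncertainty before any sampling noise. Whether that counts as "pretty darned representative" depends on how precise you need to be.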
brilliant
The world can advance too much for its own good sometimes, I guess. http://defylogic.podbean.com
I thought this was for personal pages only? Does this work on business and fan pages?
@retrogamer Unfortunately, it is scarier than that. This isn’t gathering statistical likelihoods. This is getting the real data from specific people. Some would be public in the profiles (e.g. locality, name, relationship status, groups you’re in…). Some they have gathered by you or other people running a facebook application. If you run an application (e.g. a poll, survey, quiz), it has the power to see all the information you can see about your friends. (This is so ethically wrong, IMHO). So in the process, facebook gets more and more information about you.
I just re-read part of this. When it says, “facebook will give you all the profiles of all those users,” I misinterpreted (I think) the use of the word “profile”. It actually means the “Profile” as defined by facebook. It is the page where you can enter all kinds of information about yourself. You can opt out of showing some of it to everyone (not name, location, friends, groups), but the default is that it’s public to everyone. So this method is allowing you access to the “facebook profile”, not “profile” in its normal usage.
@Max Klein What is the procedure for making your friend list private? All I can find in the privacy settings is that the friend list is part of your “publicly available information,” which you apparently can’t change.
A twist on this is storing a copy of your email list in a Gmail/free account and then using the find a friend feature on many social networks like FB, LinkedIn etc.
Yet another proof that massive centralization of personal data is plain evil. When will we see a genuine social *network*? (No, Facebook does not count: they are a social *server*.)
This isn’t a surprise….have you ever created an ad? You would know that this info is available……so what……Maybe some people don’t realize this….that’s THEIR fault…..rule of thumb….don’t put anything on the internet that you don’t want public……Read your USER AGREEMENT!!!! it’s all in there….DUMBASSES!!!
9 year old girls don’t need to be “dating.”
@Loup Vaillant – “When will we see a genuine social network?” – you’re using it right now, it’s called the world wide web
@BenFranklin1982 Make your friend list private by going to your Profile page, find the Friends box on the left side, click the little pencil icon next to “Friends” in the header, and uncheck the “Show Friend List to everyone” box.
This information has been mined for years. The “Answer these 10 questions and forward to all of your friends” snowball farms were an early version. I second the thought that you “don’t put anything online that you don’t want public”. There are massive databases that exist today simply gathering all of the information they can about people, their histories, connections, personal info, etc… This data is of incalculable value to businesses in the future. If you don’t like the idea of your information being so available, you have a big task ahead of you to stop it. The sad part is that you realistically have less and less control over your data as time moves forward. One mistake by one agency that shouldn’t have been trusted with your data is all it will take. Once you’re in the database, you’re in there for life.
One other area that concerns me is the invitation model for these services. GMail, for example, is only available via invitation. If you were to trace it back you could create a graph connecting every email address to the account that invited it. Connect this with your “profile” information and you can build a network of “who invited who” all the way back to the root. You can be 100% certain Google are aware of the power they hold with that. It wasn’t coincidental that their design gathered this information trail.
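To make the “who invited who” idea concrete, here is a minimal sketch. The addresses and the `invited_by` mapping are invented for illustration, and nothing here reflects how Google actually stores invitations.

```python
# Hypothetical data: each address maps to the address that invited it.
# The root account has no inviter (None).
invited_by = {
    "root@example.com": None,
    "alice@example.com": "root@example.com",
    "bob@example.com": "alice@example.com",
    "carol@example.com": "bob@example.com",
}

def invitation_chain(address, invited_by):
    """Follow inviter links from an address back to the root."""
    chain = [address]
    seen = {address}
    while invited_by.get(chain[-1]) is not None:
        inviter = invited_by[chain[-1]]
        if inviter in seen:  # guard against a cycle in bad data
            break
        chain.append(inviter)
        seen.add(inviter)
    return chain

print(" <- ".join(invitation_chain("carol@example.com", invited_by)))
```

A single inviter field per account is all it takes to rebuild the whole tree, which is why the trail is so easy to collect as a side effect of the sign-up design.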
Wow – the new FB “privacy rules” get worse and worse … I wonder if/when they finally get called sufficiently out on all this to reset and restart.
Collection of information from users requires their consent under Facebook’s Statement of Rights and Responsibilities and we may disable accounts of those found in violation. In addition, we’ve developed several systems to detect and block malicious use of the Friend Finder. For example, we don’t allow users to upload contact lists past a certain size. We also block users who upload contacts at an anomalous rate. We’re always working to improve these systems and others that help protect the privacy and security of our users’ information. Finally, the Friend Finder and data collection restrictions are not new and information the blog post suggests can be obtained either is not something Facebook collects (e.g. ethnicity) or is not available to non-friends by default (e.g. age and sexual orientation). However, we encourage people with concerns to configure their privacy settings appropriately.
I used Facebook several times to get the real names of people for whom I had just the email address. I found it quite practical. But more than just getting information about the person behind an email address… you could just verify the addresses in an email list for spammers. And this is more annoying.
Hi, just these 2 points:
- Combining demographic data collected from Social Networking Sites with email addresses is not new. In fact, Rapleaf has collected hundreds of millions of these profiles, either by scraping or by bulk data exchange with SNS: http://news.zdnet.com/2100-9588_22-6205716.html http://blog.rapleaf.com/database-milestones/
- Most SNSes allow email address import or searching for email addresses, although users assume their email address is private. Facebook attracts more attention now, because it has more data, and has made more data available. But also “respectable” sites like Flickr use the same technique: http://www.readwriteweb.com/archives/flickr_friends_data_portability_or_privacy_violation.php
@Pascal Van Hecke Thanks for sharing those links, very informative. Someone on HN also linked out (via the comments) to their related post, maybe you’re interested: http://petewarden.typepad.com/searchbrowser/2009/12/what-can-i-find-out-about-you-if-i-know-your-email-address.html
Hi Chris, the service Pete Warden has built is really illustrative!
Collecting data is really hard work. It does not involve mere collection, but also sorting, arranging & storing. If you are collecting email addresses or mobile numbers, there is no guarantee that they will remain in use even after 6 months. That is why it is crucial to get your hands on the right kind of data.
So far, so good, we’ll see even more intelligent spammers in the future
It was a while since you wrote your post, but there are services around this now (which go far beyond Facebook) and it has even been built into Outlook 2010. I talked about a few of these services at http://robinteractive.wordpress.com/2010/03/31/email-social-media-convergence… There's a LOT of info revealed by a simple e-mail address!
I completely agree with the above comment, the web is without any doubt growing into the most important medium of communication across the globe and it’s due to websites like this that information is spreading so quickly.
I thought it was going to be some boring old post, but it really compensated for my time. I will post a link to this page on my blog. I am sure my visitors will find that very useful.
I am really thankful to this topic because it really gives useful information.
That seems to be a great topic, I really love it.
this is really special. place
I gotta agree, this site is simply beautiful in all things business. Thanks Max!
I do trust all of the ideas you’ve offered in your post. They are really convincing and can definitely work. Nonetheless, the posts are too quick for beginners. May just you please prolong them a bit from next time? Thank you for the post.