Social profiling is the process of constructing a social media user's profile using his or her social data. In general, profiling refers to the data science process of generating a person's profile with computerized algorithms and technology. With the proliferation of popular social networks, there are various platforms for sharing this information, including but not limited to LinkedIn, Google+, Facebook and Twitter.
Social profile and social data
A person's social data refers to the personal data that they generate either online or offline (for more information, see social data revolution). A large amount of this data, including one's language, location and interests, is shared through social media and social networks. Users join multiple social media platforms, and their profiles across these platforms can be linked using different methods to obtain their interests, locations, content, and friend lists. Altogether, this information can be used to construct a person's social profile.
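As a rough illustration of how profiles on different platforms might be linked, the following minimal sketch scores a pair of profiles by username similarity and by the overlap of their declared interests, then merges them if the score is high enough. The profile fields, the equal weighting, and the 0.6 threshold are assumptions made for this example, not a method taken from the studies discussed here.

```python
from difflib import SequenceMatcher


def username_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two usernames, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_score(p1: dict, p2: dict) -> float:
    """Hypothetical linking score: the mean of username similarity and
    the Jaccard overlap of the two interest lists."""
    name_sim = username_similarity(p1["username"], p2["username"])
    i1, i2 = set(p1["interests"]), set(p2["interests"])
    union = i1 | i2
    jaccard = len(i1 & i2) / len(union) if union else 0.0
    return 0.5 * name_sim + 0.5 * jaccard


def merge_profiles(p1: dict, p2: dict, threshold: float = 0.6):
    """Return a combined social profile, or None if the two profiles
    do not look like the same person (threshold is an assumption)."""
    if link_score(p1, p2) < threshold:
        return None
    return {
        "usernames": sorted({p1["username"], p2["username"]}),
        "interests": sorted(set(p1["interests"]) | set(p2["interests"])),
        "locations": sorted({p1["location"], p2["location"]}),
    }
```

In practice, linking methods draw on many more signals (friend lists, posting times, writing style); the point of the sketch is only that weak signals from several platforms can be combined into one profile.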
Satisfying users' information needs is becoming more challenging, because the explosive growth of online data generates too much "noise", which hampers information collection. Social profiling is an emerging approach to overcoming these challenges: it introduces the concept of personalized search while taking into account user profiles generated from social network data. One study reviews research that infers users' social profile attributes from social media data and classifies it into individual and group profiling, highlighting the existing techniques along with the data sources used, their limitations, and the open challenges.
The prominent approaches adopted include machine learning, ontology, and fuzzy logic. Most of the studies have used social media data from Twitter and Facebook to infer the social attributes of users. The literature shows that user social attributes, including age, gender, home location, wellness, emotion, opinion, relation, and influence, still need to be explored.
Personalized meta-search engines
The ever-increasing volume of online content has reduced the effectiveness of centralized search engines' results, which can no longer satisfy users' demand for information. A possible solution that would increase the coverage of search results is meta-search engines, an approach that collects information from numerous centralized search engines. A new problem thus emerges: too much data and too much noise are generated in the collection process.
Therefore, a new technique called personalized meta-search engines was developed. It makes use of a user's profile (largely the social profile) to filter the search results. A user's profile can be a combination of a number of things, including but not limited to "a user's manual selected interests, user's search history", and personal social network data.
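A minimal sketch of this idea follows, assuming each engine returns a ranked list of results with a URL and a title, and that the user's profile has been reduced to a set of interest keywords. The function names and the keyword-overlap scoring are hypothetical; real systems would use richer profile signals.

```python
def merge_results(result_lists):
    """Merge ranked lists from several engines, dropping duplicate URLs
    and keeping the first occurrence of each."""
    seen, merged = set(), []
    for results in result_lists:
        for r in results:
            if r["url"] not in seen:
                seen.add(r["url"])
                merged.append(r)
    return merged


def personalize(results, profile_interests):
    """Re-rank results by how many profile interest keywords appear in
    the title. sorted() is stable, so ties keep their merged order."""
    interests = {term.lower() for term in profile_interests}

    def score(result):
        return len(set(result["title"].lower().split()) & interests)

    return sorted(results, key=score, reverse=True)


# Toy data: two engines answering the ambiguous query "jaguar".
engine_a = [
    {"url": "a.com", "title": "Jaguar car reviews"},
    {"url": "b.com", "title": "Jaguar habitat and wildlife"},
]
engine_b = [
    {"url": "b.com", "title": "Jaguar habitat and wildlife"},
    {"url": "c.com", "title": "Jaguar football team"},
]
ranked = personalize(merge_results([engine_a, engine_b]), ["wildlife", "nature"])
```

For a user whose profile lists "wildlife", the result about the animal moves to the top, while results the profile says nothing about keep their original relative order.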
Social media profiling
According to Samuel D. Warren II and Louis Brandeis (1890), the disclosure of private information and its misuse can hurt people's feelings and cause considerable damage in people's lives. Social networks give people access to intimate online interactions; therefore, information access control, information transactions, privacy issues, connections and relationships on social media have become important research fields and are subjects of public concern.
Ricard Fogues and co-authors state that "any privacy mechanism has at its base an access control", which dictates "how permissions are given, what elements can be private, how access rules are defined, and so on".
Current access control for social media accounts still tends to be very simplistic: there is very limited diversity in the categories of relationships available for social network accounts. On most platforms, users' relationships to others are categorized only as "friend" or "non-friend", so people may leak important information to "friends" inside their social circle who are not necessarily the users with whom they consciously want to share it.
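To illustrate what a less simplistic scheme could look like, the sketch below assumes each contact is assigned a relationship category and each post names the categories allowed to see it. The category names and data layout are hypothetical and do not describe any particular platform.

```python
def can_view(post: dict, viewer: str, relationships: dict) -> bool:
    """Allow access only if the viewer's relationship category is in the
    post's audience. Unknown viewers are treated as strangers."""
    category = relationships.get(viewer, "stranger")
    return category in post["audience"]


# Hypothetical relationship categories, finer than friend / non-friend.
relationships = {
    "alice": "family",
    "bob": "colleague",
    "carol": "close_friend",
}
post = {
    "text": "Moving house next week",
    "audience": {"family", "close_friend"},
}
```

Under a binary friend/non-friend model, "bob" would see this post along with everyone else in the friend list; with per-category audiences he does not.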
The section below is concerned with social media profiling and what profiling information on social media accounts can achieve.
Privacy leaks
A lot of information is voluntarily shared on online social networks, such as photos and updates on life activities (new job, hobbies, etc.). People often assume that their accounts on different platforms will not be linked as long as they do not grant permission for such links. However, according to Diane Gan, information gathered online enables "target subjects to be identified on other social networking sites such as Foursquare, Instagram, LinkedIn, Facebook and Google+, where more personal information was leaked".
The majority of social networking platforms use an "opt-out" approach for their features. If users wish to protect their privacy, it is their own responsibility to check and change the privacy settings, as a number of them are set to a default option.
Major social network platforms have developed geo-tagging functions, which are in popular use. This is concerning because 39% of users have experienced profile hacking; 78% of burglars have used major social media networks and Google Street View to select their victims; and 54% of burglars attempted to break into empty houses when people posted their status updates and geo-locations.
Facebook
The formation and maintenance of social media accounts and their relationships with other accounts are associated with various social outcomes. As of 2015, customer relationship management was essential for many firms and was partially carried out through Facebook.
Before the emergence and prevalence of social media, customer identification was primarily based on information that a firm could acquire directly, for example through a customer's purchasing process or the voluntary act of completing a survey or joining a loyalty program. However, the rise of social media has greatly reduced reliance on this approach to building a customer's profile/model from directly acquired data. Marketers now increasingly seek customer information through Facebook; this may include a variety of information that users disclose to all or some users on Facebook: name, gender, date of birth, e-mail address, sexual orientation, marital status, interests, hobbies, favorite sports team(s), favorite athlete(s), or favorite music, and, more importantly, Facebook connections.
However, due to the privacy policy design, acquiring true information on Facebook is no trivial task. Often, Facebook users either refuse to disclose true information (sometimes using pseudonyms) or set information to be visible only to friends; Facebook users who "LIKE" a page are also hard to identify. To profile and cluster users online, marketers and companies can and will access the following kinds of data: gender, the IP address and city of each user through the Facebook Insight page, who "LIKED" a certain user, a list of all the pages that a person "LIKED" (transaction data), other people that a user follows (even beyond the first 500, which usually cannot be seen) and all publicly shared data.
Twitter
First launched on the Internet in March 2006, Twitter is a platform on which users can connect and communicate with any other user in just 280 characters. Like Facebook, Twitter is also a crucial channel through which users leak important information, often unconsciously, that can be accessed and collected by others.
According to Rachel Nuwer, in a sample of 10.8 million tweets by more than 5,000 users, the posted and publicly shared information is enough to reveal a user's income range. Daniel Preoţiuc-Pietro, a postdoctoral researcher at the University of Pennsylvania, and his colleagues were able to categorize 90% of users into corresponding income groups. Their collected data, after being fed into a machine-learning model, generated reliable predictions on the characteristics of each income group.
The mobile app called Streamd.in displays live tweets on Google Maps by using geo-location details attached to the tweet, and traces the user's movement in the real world.
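How geo-tagged posts can reveal movement may be sketched as follows. The example assumes each tweet carries a timestamp and a latitude/longitude pair, orders the tweets by time, and computes the great-circle (haversine) distance between consecutive positions. The record layout is hypothetical, and this is not a description of how Streamd.in itself works.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in
    decimal degrees, using a mean Earth radius of 6371 km."""
    r = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def trace_movement(tweets):
    """Order geo-tagged tweets by time and return one tuple per leg:
    (start_time, end_time, distance_km)."""
    ordered = sorted(tweets, key=lambda t: t["time"])
    return [
        (a["time"], b["time"],
         haversine_km(a["lat"], a["lon"], b["lat"], b["lon"]))
        for a, b in zip(ordered, ordered[1:])
    ]
```

Even without any text analysis, a handful of timestamped coordinates is enough to reconstruct where a user was and roughly how far they travelled between posts.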
Profiling photos on social network
The advent and universality of social media networks have boosted the role of images and visual information dissemination.
Many types of visual information on social media transmit messages from the author, location information and other personal information. For example, a user may post a photo of themselves in which landmarks are visible, which can enable other users to determine where they are. A study by Cristina Segalin, Dong Seon Cheng and Marco Cristani found that profiling the photos users post can reveal personal traits such as personality and mood.
The study introduces convolutional neural networks (CNNs) and builds on the main characteristics of computational aesthetics (CA) (emphasizing "computational methods", "human aesthetic point of view", and "the need to focus on objective approaches") defined by Hoenig (Hoenig, 2005). This tool can extract and identify content in photos.
Tags
In a study called "A Rule-Based Flickr Tag Recommendation System", the author suggests personalized tag recommendations, largely based on user profiles and other web resources. This approach has proven useful in many areas: "web content indexing", "multimedia data retrieval", and enterprise Web searches.
Delicious
Flickr
Zooomr
Marketing
As of 2011, marketers and retailers were increasing their market presence by creating their own pages on social media, on which they post information, ask people to like and share in order to enter contests, and much more. Studies in 2011 showed that on average a person spends about 23 minutes on a social networking site per day. Therefore, companies both small and large are investing in gathering user behavior information, ratings, reviews, and more.
Facebook
Until 2006, online communication was not content-led in terms of the amount of time people spent online. Since then, however, sharing and creating content has become the primary online activity of general social media users, which has permanently changed online marketing. In the book Advanced Social Media Marketing, the author gives an example of how a New York wedding planner might identify his audience when marketing on Facebook. Some of these categories may include users who: (1) live in the United States; (2) live within 50 miles of New York; (3) are aged 21 or older; (4) are engaged females.
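The four targeting criteria in the wedding planner example amount to a simple filter over profile attributes. The sketch below shows this under the assumption that each attribute is already available as a field; the User record and its field names are hypothetical and are not Facebook's actual advertising interface.

```python
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical profile fields used only for this illustration.
    country: str
    miles_from_new_york: float
    age: int
    gender: str
    relationship_status: str


def in_target_audience(u: User) -> bool:
    """Apply the four criteria from the wedding planner example."""
    return (
        u.country == "United States"
        and u.miles_from_new_york <= 50
        and u.age >= 21
        and u.gender == "female"
        and u.relationship_status == "engaged"
    )


users = [
    User("United States", 12.0, 27, "female", "engaged"),
    User("United States", 300.0, 27, "female", "engaged"),
    User("United States", 5.0, 19, "female", "engaged"),
]
audience = [u for u in users if in_target_audience(u)]
```

Only the first user satisfies all four criteria; the second is too far from New York and the third is under 21.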
Whether you choose to pay per click or per impression/view, "the cost of Facebook Marketplace ads and Sponsored Stories is set by your maximum bid and the competition for the same audiences".
The cost of a click is usually $0.5–1.5.
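As a worked example of the arithmetic, assuming a hypothetical advertising budget and the per-click range quoted above, the number of clicks a budget can buy lies between the budget divided by the highest click cost and the budget divided by the lowest.

```python
import math


def click_range(budget: float, min_cpc: float = 0.5, max_cpc: float = 1.5):
    """Return (fewest, most) whole clicks a budget can buy when each
    click costs between min_cpc and max_cpc dollars."""
    fewest = math.floor(budget / max_cpc)
    most = math.floor(budget / min_cpc)
    return fewest, most


# A hypothetical $100 budget buys between 66 and 200 clicks.
fewest, most = click_range(100)
```

The actual figure depends on the bid and on competition for the same audience, so the range is only a planning estimate.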
Tools
Klout
Klout is a popular online tool that focuses on assessing a user's social influence by social profiling. It takes several social media platforms (such as Facebook, Twitter, etc.) and numerous aspects into account and generates a user's score from 1 to 100. Regardless of one's number of likes for a post, or connections on LinkedIn, social media contains plentiful personal information. Klout generates a single score that indicates a person's influence.
In a study called "How Much Klout do You Have...A Test of System Generated Cues on Source Credibility" by Chad Edwards, Klout scores were found to influence people's perceived credibility. As the Klout Score becomes a popular method of combining many signals into one score for assessing people's influence, it can be a convenient tool and a biased one at the same time. A study by David Westerman of how social media followers influence people's judgments illustrates the possible bias that Klout may contain.
In one study, participants were asked to view six identical mock Twitter pages with only one major independent variable: the number of page followers. The results show that pages with too many or too few followers both lose credibility, despite their similar content. The Klout score may be subject to the same bias.
While Klout scores are sometimes used during the recruitment process, this practice remains controversial.
Kred
Kred not only assigns each user an influence score, but also allows each user to claim a Kred profile and Kred account. Through this platform, each user can view how top influencers engage with their online community and how each of the user's own online actions has affected his or her influence score.

Kred offers several suggestions for increasing influence: (1) be generous with your audience, and feel comfortable sharing content from your friends and tweeting others; (2) join an online community; (3) create and share meaningful content; (4) track your progress online.
Follower Wonk
Follower Wonk is specifically targeted towards Twitter analytics. It helps users understand follower demographics and optimize their activities by finding which activity attracts the most positive feedback from followers.
Keyhole
Keyhole is a hashtag tracking and analytics tool that tracks Instagram, Twitter and Facebook hashtag data. It is a service that allows users to track which top influencers are using a certain hashtag and to view other demographic information about the hashtag. When a hashtag is entered on its website, it automatically takes a random sample of users who are currently using the tag, which allows users to analyze each hashtag they are interested in.
Online activist social profile
The prevalence of the Internet and social media has provided online activists with both a new platform for activism and its most popular tool. While online activism might stir up great controversy and attention, few people actually participate in or make sacrifices for the relevant events. Analysing the profile of online activists has therefore become a topic of interest. In a study by Harp and co-authors of online activists in China, Latin America and the United States, the majority of online activists in Latin America and China are male, with a median income of $10,000 or less, while the majority of online activists in the United States are female, with a median income of $30,000 to $69,999. The education level of online activists in the United States tends to be postgraduate work/education, while activists in the other regions have lower education levels.
A closer examination of their online shared content shows that the most shared information online falls into five types:
# To fundraise: Of the three groups, China's activists post the most fundraising content.
# To post links: Latin American activists post the most links.
# To promote debate or discussion: Both Latin America's and China's activists post more content to promote debate or discussion than American activists do.
# To post information such as announcements and news: American activists post more such content than the activists from the other regions.
# To communicate with journalists: In this category, China's activists take the lead.
Social credit score in China
The Chinese government hopes to establish a "social-credit system" that aims to score the "financial creditworthiness of citizens", social behavior and even political behavior.
This system will combine big data and social profiling technologies. According to Celia Hatton of BBC News, everyone in China will be expected to enroll in a national database that includes and automatically calculates fiscal information, political behavior, social behavior and daily life, including minor traffic violations, producing a single score that evaluates a citizen's trustworthiness.
Credibility scores, social influence scores and other comprehensive evaluations of people are not rare in other countries. However, China's "social-credit system" remains controversial, as this single score can reflect every aspect of a person's life.
Indeed, "much about the social-credit system remains unclear".
How would companies be limited by the credit score system in China?
Although the implementation of the social credit score remains controversial in China, the Chinese government aims to fully implement this system by 2018.
According to Jake Laband (the deputy director of the Beijing office of the US-China Business Council), low credit scores will "limit eligibility for financing, employment, and Party membership, as well restrict real estate transactions and travel." The social credit score will be affected not only by legal criteria but also by social criteria, such as contract breaking. However, this has been a great privacy concern for big companies due to the huge amount of data that will be analyzed by the system.
See also
* Account verification
* Digital identity
* Online identity
* Online identity management
* Online presence management
* Online reputation
* Persona (user experience)
* Personal information
* Personal identity
* Real-name system
* Reputation management
* Reputation system
* Social media optimization
* User profile