Developing an agent-based model to minimize spreading of malicious information in dynamic social networks
This research introduces a systematic and multidisciplinary agent-based model to interpret and simplify the dynamic actions of users and communities in an evolving online (offline) social network. An organizational cybernetics approach is used to control and monitor the spread of malicious information between communities. A stochastic one-median problem is solved to minimize agent response time and eliminate information spread across the online (offline) environment. The performance of these methods was measured against a Twitter network related to an armed protest demonstration against the COVID-19 lockdown in the state of Michigan in May 2020. The proposed model demonstrated the dynamicity of the network, enhanced agent-level performance, minimized the spread of malicious information, and measured the response to a second stochastic information spread in the network.
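The facility-location step underlying the abstract can be illustrated with a deterministic toy instance of the one-median problem: pick the node that minimizes demand-weighted shortest-path distance to all others. The graph, demand weights, and function names below are invented for illustration; the paper's stochastic version would replace the fixed demands with expected (probabilistic) demands.

```python
from collections import deque

def bfs_dist(adj, src):
    """Unweighted shortest-path distances from src via breadth-first search."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def one_median(adj, demand):
    """Return the node minimizing total demand-weighted distance, and that cost."""
    best, best_cost = None, float("inf")
    for cand in adj:
        d = bfs_dist(adj, cand)
        cost = sum(demand[v] * d[v] for v in adj)
        if cost < best_cost:
            best, best_cost = cand, cost
    return best, best_cost

# Toy network: a hub "b" with three leaves; leaf "c" has the heaviest demand.
adj = {"a": ["b"], "b": ["a", "c", "d"], "c": ["b"], "d": ["b"]}
demand = {"a": 1, "b": 1, "c": 2, "d": 1}
print(one_median(adj, demand))  # ('b', 4)
```

Placing the monitoring agent at the hub minimizes the weighted response distance, which is the intuition the abstract's response-time objective formalizes.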
280 characters to the White House: predicting 2020 U.S. presidential elections from twitter data
The 2020 election plays a vital role in shaping the future of the U.S. and the entire world. With the growing importance of social media, the public uses them to express their thoughts and communicate with others. Social media have been used for political campaigns and election activities, especially Twitter. The researchers intend to predict presidential election results by analyzing the public stance toward the candidates using Twitter data. Previous researchers have not succeeded in finding a model that simulates the U.S. presidential election system well. This manuscript proposes an efficient model that predicts the 2020 U.S. presidential election from geo-located tweets by leveraging sentiment analysis, a multinomial naive Bayes classifier, and machine learning. An extensive study is performed for all 50 states to predict the 2020 U.S. presidential election results, led by the state-based public stance for electoral votes. The general public stance is also predicted for the popular vote. The true public stance is preserved by eliminating all outliers and removing suspicious tweets generated by bots and agents recruited to manipulate the election. The pre-election and post-election public stances are also studied with their variations in time and space. The influencers' effect on the public stance is discussed. Network analysis and community detection techniques were performed to detect any hidden patterns. An algorithm-defined stance-meter decision rule was introduced to predict Joe Biden as the President-elect. The model's effectiveness in predicting the election results for each state was validated by comparing the predicted results with the actual election results. With a percentage of 89.9%, the proposed model showed that Joe Biden dominated the electoral college and became the winner of the 2020 U.S. presidential election.
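The stance-classification step the abstract refers to can be sketched with a from-scratch multinomial naive Bayes classifier with add-one smoothing. The toy tweets, stance labels, and function names are illustrative only, not the authors' data or code.

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Fit a multinomial naive Bayes model with add-one (Laplace) smoothing."""
    classes = sorted(set(labels))
    priors = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    word_counts = {c: Counter() for c in classes}
    for doc, c in zip(docs, labels):
        word_counts[c].update(doc.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    likelihoods = {
        c: {w: math.log((word_counts[c][w] + 1) /
                        (sum(word_counts[c].values()) + len(vocab)))
            for w in vocab}
        for c in classes
    }
    return priors, likelihoods, vocab

def predict_nb(model, doc):
    """Return the class with the highest posterior log-probability."""
    priors, likelihoods, vocab = model
    scores = {c: priors[c] + sum(likelihoods[c][w]
                                 for w in doc.split() if w in vocab)
              for c in priors}
    return max(scores, key=scores.get)

# Invented training tweets with invented stance labels.
train = [("great plan love biden", "pro_biden"),
         ("biden bad economy terrible", "anti_biden"),
         ("love biden leadership great", "pro_biden"),
         ("terrible bad policies", "anti_biden")]
model = train_nb([d for d, _ in train], [c for _, c in train])
print(predict_nb(model, "love this great plan"))  # pro_biden
```

Aggregating such per-tweet stance predictions by geo-located state is what would drive a state-by-state electoral-vote estimate of the kind the paper describes.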
Coordinating Narratives Framework for cross-platform analysis in the 2021 US Capitol riots
Coordinated disinformation campaigns are used to influence social media users, potentially leading to offline violence. In this study, we introduce a general methodology to uncover coordinated messaging through an analysis of user posts on Parler. The proposed Coordinating Narratives Framework constructs a user-to-user coordination graph, which is induced by a user-to-text graph and a text-to-text similarity graph. The text-to-text graph is constructed based on the textual similarity of Parler and Twitter posts. We study three influential groups of users in the 6 January 2021 Capitol riots and detect networks of coordinated user clusters that post similar textual content in support of disinformation narratives related to the 2020 U.S. elections. We further extend our methodology to Twitter tweets to identify authors who share the same disinformation messaging as the aforementioned Parler user groups.
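The graph-induction idea can be sketched in miniature: link two users whenever they posted texts whose pairwise similarity exceeds a threshold. Jaccard token overlap stands in here for the framework's actual text-similarity measure, and the users, posts, and threshold below are invented for illustration.

```python
from itertools import combinations

def jaccard(a, b):
    """Token-set Jaccard similarity between two texts."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

# Invented (user, post) pairs; u1 and u2 push near-identical messaging.
posts = [("u1", "stop the steal the election was rigged"),
         ("u2", "the election was rigged stop the steal"),
         ("u3", "lovely weather for a walk today")]

threshold = 0.7
coordination_edges = set()
# Text-to-text similarity induces user-to-user coordination edges.
for (ua, ta), (ub, tb) in combinations(posts, 2):
    if ua != ub and jaccard(ta, tb) >= threshold:
        coordination_edges.add(tuple(sorted((ua, ub))))

print(coordination_edges)  # {('u1', 'u2')}
```

Clustering the resulting user-to-user graph is then what surfaces coordinated groups; the same induction can run across platforms by mixing Parler and Twitter posts into one text set, as the framework does.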
Fake or not? Automated detection of COVID-19 misinformation and disinformation in social networks and digital media
With the continuous spread of the COVID-19 pandemic, misinformation poses serious threats and concerns. COVID-19-related misinformation integrates a mixture of health aspects along with news and political misinformation. This mixture complicates the ability to judge whether a claim related to COVID-19 is information, misinformation, or disinformation. With no standard terminology for misinformation and disinformation, integrating different datasets and using existing classification models can be impractical. To deal with these issues, we aggregated several COVID-19 misinformation datasets and compared models trained on individual datasets with a model trained on the aggregated data. We also evaluated the impact of using several word- and sentence-embedding models and transformers on the performance of classification models. We observed that whereas word-embedding models showed improvements in all evaluated classification models, the level of improvement varied among the different classifiers. Although our work focused on COVID-19 misinformation detection, a similar approach can be applied to myriad other topics, such as the recent Russian invasion of Ukraine.
Vaccination trials on hold: malicious and low credibility content on Twitter during the AstraZeneca COVID-19 vaccine development
The development of COVID-19 vaccines during the global pandemic that started in 2020 was marked by uncertainty and misinformation, also reflected on social media. This paper provides a quantitative evaluation of the Uniform Resource Locators (URLs) shared on Twitter around the clinical trials of the AstraZeneca vaccine and their temporary interruption in September 2020. We analyzed URLs cited in Twitter messages before and after the temporary interruption of the vaccine development on September 9, 2020, to investigate the presence of low-credibility and malicious information. We show that the halt of the AstraZeneca clinical trials prompted tweets that cast doubt, fear, and vaccine opposition. We discovered a strong presence of URLs from low-credibility or malicious websites, as classified by independent fact-checking organizations or identified by web hosting infrastructure features. Moreover, we identified what appear to be coordinated operations to artificially promote some of these URLs hosted on malicious websites.
Social distance "nudge:" a context aware mHealth intervention in response to COVID pandemics
The impact of the COVID pandemic on our society is unprecedented in our time. As the coronavirus mutates, maintaining social distance remains an essential step in defending personal as well as public health. This study conceptualizes the social distance "nudge" and explores the efficacy of mHealth digital intervention, while developing and validating a choice architecture that aims to influence users' behavior in maintaining social distance in their own self-interest. End-user nudging experiments were conducted via a mobile phone app that was developed as a research artifact. The accuracy of social distance nudging was validated in both the United States and Japan. Future work will consider behavioral studies to better understand the effectiveness of this digital nudging intervention.
Differences between antisemitic and non-antisemitic English language tweets
Antisemitism is a global phenomenon on the rise that is negatively affecting Jews and communities more broadly. It has been argued that social media has opened up new opportunities for antisemites to disseminate material and organize. It is, therefore, necessary to get a picture of the scope and nature of antisemitism on social media. However, identifying antisemitic messages in large datasets is not trivial and more work is needed in this area. In this paper, we present and describe an annotated dataset that can be used to train tweet classifiers. We first explain how we created our dataset and approached identifying antisemitic content by experts. We then describe the annotated data, where 11% of conversations about Jews (January 2019-August 2020) and 13% of conversations about Israel (January-August 2020) were labeled antisemitic. Another important finding concerns lexical differences across queries and labels. We find that antisemitic content often relates to conspiracies of Jewish global dominance, the Middle East conflict, and the Holocaust.
Editorial of the Special Issue from WorldCIST'20
Measuring the impact of suspending Umrah, a global mass gathering in Saudi Arabia on the COVID-19 pandemic
Since the early days of the coronavirus (COVID-19) outbreak in Wuhan, China, Saudi Arabia started to implement several preventative measures, beginning with the imposition of travel restrictions to and from China. Due to the rapid spread of COVID-19, and with the first confirmed case in Saudi Arabia in March 2020, stricter measures, such as international travel restrictions and the suspension or cancellation of major events, social gatherings, prayers at mosques, and sports competitions, were employed. These non-pharmaceutical interventions aim to reduce the extent of the epidemic, given the implications of international travel and mass gatherings for the increase in the number of new cases locally and globally. Since this ongoing outbreak is the first of its kind in the modern world, the impact of suspending mass gatherings on the outbreak is unknown and difficult to measure. We use a stratified SEIR epidemic model to evaluate the impact of Umrah, a global Muslim pilgrimage to Mecca, on the spread of the COVID-19 pandemic during the month of Ramadan, the peak of the Umrah season. The analyses shown in the paper provide insights into the effects of global mass gatherings such as Hajj and Umrah on the progression of the COVID-19 pandemic locally and globally.
Computational simulation of the COVID-19 epidemic with the SEIR stochastic model
A small number of individuals infected within a community can lead to the rapid spread of the disease throughout that community, leading to an epidemic outbreak. This is even more true for highly contagious diseases such as COVID-19, known to be caused by the new coronavirus SARS-CoV-2. Mathematical models of epidemics allow estimating several impacts on the population and, therefore, are of great use for the definition of public health policies. Some of these measures include the isolation of the infected (also known as quarantine) and the vaccination of the susceptible. In a possible scenario in which a vaccine is available, but with limited access, it is necessary to quantify the levels of vaccination to be applied, taking into account the continued application of preventive measures. This work concerns the simulation of the spread of the COVID-19 disease in a community by applying the Monte Carlo method to a Susceptible-Exposed-Infective-Recovered (SEIR) stochastic epidemic model. To handle the computational effort involved, a simple parallelization approach was adopted and deployed in a small HPC cluster. The developed computational method makes it possible to realistically simulate the spread of COVID-19 in a medium-sized community and to study the effect of preventive measures such as quarantine and vaccination. The results show that an effective combination of vaccination with quarantine can prevent the appearance of major epidemic outbreaks, even if the critical vaccination coverage is not reached.
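A minimal single-trajectory sketch of a stochastic SEIR model is given below, using a discrete-time chain-binomial update (one of several common stochastic SEIR formulations; the paper's exact scheme, parameter values, and parallel Monte Carlo setup are not reproduced here, and all numbers are illustrative).

```python
import math
import random

def seir_step(S, E, I, R, beta, sigma, gamma, N):
    """One day of a chain-binomial stochastic SEIR update."""
    p_inf = 1 - math.exp(-beta * I / N)  # per-susceptible daily infection risk
    new_E = sum(random.random() < p_inf for _ in range(S))  # S -> E
    new_I = sum(random.random() < sigma for _ in range(E))  # E -> I
    new_R = sum(random.random() < gamma for _ in range(I))  # I -> R
    return S - new_E, E + new_E - new_I, I + new_I - new_R, R + new_R

def simulate(days=100, N=1000, I0=5, beta=0.4, sigma=0.2, gamma=0.1, seed=1):
    """One Monte Carlo epidemic trajectory; returns the final (S, E, I, R)."""
    random.seed(seed)
    S, E, I, R = N - I0, 0, I0, 0
    for _ in range(days):
        S, E, I, R = seir_step(S, E, I, R, beta, sigma, gamma, N)
    return S, E, I, R

print(simulate())
```

A Monte Carlo study of the kind the paper describes would run many such trajectories with different seeds and average the outcomes; quarantine and vaccination can be modeled by reducing the effective `beta` or by moving part of `S` directly to `R` at the start.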
To illuminate and motivate: A fuzzy-trace model of the spread of information online
We propose, and test, a model of online media platform users' decisions to act on, and share, received information. Specifically, we focus on how features of message content drive its spread. Our model is based on Fuzzy-Trace Theory (FTT), a leading theory of decision making under risk. Per FTT, online content is mentally represented in two ways: verbatim (objective, but decontextualized, facts) and gist (subjective, but meaningful, interpretation). Although encoded in parallel, gist tends to drive behaviors more strongly than verbatim representations for most individuals. Our model uses factors derived from FTT to make predictions regarding which content is more likely to be shared, namely: a) different levels of mental representation, b) the motivational content of a message, c) difficulty of information processing (e.g., the ease with which a given message may be comprehended and, therefore, its gist extracted), and d) social values.
"The coronavirus is a bioweapon": classifying coronavirus stories on fact-checking sites
The 2020 coronavirus pandemic has heightened the need to flag coronavirus-related misinformation, and fact-checking groups have taken to verifying misinformation on the Internet. We explore stories reported by fact-checking groups PolitiFact, Poynter and Snopes from January to June 2020. We characterise these stories into six clusters, then analyse temporal trends of story validity and the level of agreement across sites. The sites present the same stories 78% of the time, with the highest agreement between Poynter and PolitiFact. We further break down the story clusters into more granular story types by proposing a unique automated method, which can be used to classify diverse story sources in both fact-checked stories and tweets. Our results show story type classification performs best when trained on the same medium, with contextualised BERT vector representations outperforming a Bag-Of-Words classifier.
Active, aggressive, but to little avail: characterizing bot activity during the 2020 Singaporean elections
Digital disinformation presents a challenging problem for democracies worldwide, especially in times of crisis like the COVID-19 pandemic. In countries like Singapore, legislative efforts to quell fake news constitute relatively new and understudied contexts for understanding local information operations. This paper presents a social cybersecurity analysis of the 2020 Singaporean elections, which took place at the height of the pandemic and after the recent passage of an anti-fake news law. Harnessing a dataset of 240,000 tweets about the elections, we found that 26.99% of participating accounts were likely to be bots, responsible for a larger proportion of tweets than in the 2015 election. Textual analysis further showed that the detected bots used simpler and more abusive second-person language, as well as hashtags related to COVID-19 and voter activity, pointing to aggressive tactics potentially fuelling online hostility and questioning the legitimacy of the polls. Finally, bots were associated with larger, less dense, and less echo chamber-like communities, suggesting efforts to participate in larger, mainstream conversations. However, despite their distinct narrative and network maneuvers, bots generally did not hold significant influence throughout the social network. Hence, although the intersecting concerns of political conflict during a global pandemic may promptly raise the possibility of online interference, we quantify both the efforts and the limits of bot-fueled disinformation in the 2020 Singaporean elections. We conclude with several implications for digital disinformation in times of crisis, in the Asia-Pacific and beyond.
The effect of ICT and higher-order capabilities on the performance of Ibero-American SMEs
Information and communication technologies (ICT) have the ability to create value by enabling other firm capabilities. Based on the ICT-enabled capabilities perspective, this study explores the direct and indirect effects of lower- and higher-order capabilities, such as ICT, knowledge management capability (KM), and product innovation flexibility (PIF), on the performance of Ibero-American small- and medium-sized enterprises (SMEs). This paper uses second-order structural equation models to test the research hypotheses with a sample of 130 Ibero-American SMEs. The results contribute to filling the gap in the SME-focused literature on empirical studies examining ICT-enabled capabilities and firm performance. The results show an enabling effect of ICT on higher-order capabilities, such as KM and PIF, which, by acting as mediating variables, create value and improve performance through innovation in firms.
Disinformation: analysis and identification
We present an extensive study on disinformation, which is defined as information that is false and misleading and intentionally shared to cause harm. Through this work, we aim to answer the following questions: Can we automatically and accurately classify a news article as containing disinformation? What characteristics of disinformation differentiate it from other types of benign information? We conduct this study in the context of two significant events: the US elections of 2016 and the 2020 COVID pandemic. We build a series of classifiers to (i) examine linguistic clues exhibited by different types of fake news articles, (ii) analyze "clickbaityness" of disinformation headlines, and (iii) finally, perform fine-grained, veracity-based article classification through a natural language inference (NLI) module for automated disinformation verification; this utilizes a manually curated set of evidence sources. For the latter, we built a new dataset that is annotated with generic, veracity-based labels and ground truth evidence supporting each label. The veracity labels were formulated based on examining standards used by reputable fact-checking organizations. We show that disinformation derives features from both propaganda and mainstream news, making it more challenging to detect. However, there is significant potential for automating the fact-checking process to incorporate the degree of veracity. We provide error analysis that illustrates the challenges involved in the automated fact-checking task and identifies factors that may improve this process in future work. Finally, we also describe the implementation of a web app that extracts important entities and actions from a given article and searches the web to gather evidence from credible sources. The evidence articles are then used to generate a veracity label that can assist manual fact-checkers engaged in combating disinformation.
ReOpen demands as public health threat: a sociotechnical framework for understanding the stickiness of misinformation
In the absence of a national, coordinated response to COVID-19, state and local representatives had to create and enforce individualized plans to protect their constituents. Alongside the challenge of trying to curb the virus, public health officials also had to contend with the spread of false information. This problematic content often contradicted safeguards, like masks, while promoting unverified and potentially lethal treatments. One of the most active groups denying the threat of COVID is the ReOpen the States Movement. By combining qualitative content analysis with ethnographic observations of public ReOpen groups on Facebook, this paper provides a better understanding of the central narratives circulating among ReOpen members and the information they relied on to support their arguments. Grounded in notions of individualism and self-inquiry, members sought to reinterpret datasets to downplay the threat of COVID and suggest public safety workarounds. When the platform tried to flag problematic content, a lack of institutional trust had members doubting the validity of the fact-checkers, highlighting the tight connection between misinformation and epistemology.
Urban life: a model of people and places
We introduce the Urban Life agent-based simulation used by the Ground Truth program to capture the innate needs of a human-like population and explore how such needs shape social constructs such as friendship and wealth. Urban Life is a spatially explicit model to explore how urban form impacts agents' daily patterns of life. By meeting up at places agents form social networks, which in turn affect the places the agents visit. In our model, location and co-location affect all levels of decision making as agents prefer to visit nearby places. Co-location is necessary (but not sufficient) to connect agents in the social network. The Urban Life model was used in the Ground Truth program as a virtual world testbed to produce data in a setting in which the underlying ground truth was explicitly known. Data was provided to research teams to test and validate Human Domain research methods to an extent previously impossible. This paper summarizes our Urban Life model's design and simulation along with a description of how it was used to test the ability of Human Domain research teams to predict future states and to prescribe changes to the simulation to achieve desired outcomes in our simulated world.
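The co-location mechanism described above can be sketched in a one-dimensional toy world: each agent visits its nearest place, and agents who end up at the same place may form a friendship edge. The agents, places, and friendship probability are invented for illustration; the actual Urban Life model is spatially explicit and far richer.

```python
import random

def simulate_day(agent_pos, place_pos, friend_prob, rng):
    """Each agent visits its nearest place; co-located agents may befriend."""
    visits = {}
    for a, x in agent_pos.items():
        # Nearest-place choice: location drives the daily pattern of life.
        visits[a] = min(place_pos, key=lambda p: abs(place_pos[p] - x))
    edges = set()
    agents = sorted(agent_pos)
    for i, a in enumerate(agents):
        for b in agents[i + 1:]:
            # Co-location is necessary (but not sufficient) for a new tie.
            if visits[a] == visits[b] and rng.random() < friend_prob:
                edges.add((a, b))
    return visits, edges

rng = random.Random(0)
agent_pos = {"ann": 0.1, "bob": 0.2, "cho": 0.9}   # positions on a unit line
place_pos = {"cafe": 0.0, "park": 1.0}
visits, edges = simulate_day(agent_pos, place_pos, 1.0, rng)
print(visits)  # ann and bob visit the cafe; cho visits the park
print(edges)   # only the co-located pair can form an edge
```

Iterating such days, with visit choices feeding back on the growing friendship network, is the loop through which place shapes social structure in models of this kind.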
Does big data serve policy? Not without context. An experiment with in silico social science
The DARPA Ground Truth project sought to evaluate social science by constructing four varied simulated social worlds with hidden causality and unleashing teams of scientists to collect data, discover their causal structure, predict their future, and prescribe policies to create desired outcomes. This large-scale, long-term experiment in in silico social science, in which the ground truth of the simulated worlds was known, but not by us, reveals the limits of contemporary quantitative social science methodology. First, problem solving without a shared ontology, in which many world characteristics remain existentially uncertain, poses strong limits to quantitative analysis even when scientists share a common task, and suggests how those limits could become insurmountable without one. Second, data labels biased the associations our analysts made and the assumptions they employed, often away from the simulated causal processes those labels signified, suggesting limits on the degree to which analytic concepts developed in one domain may port to others. Third, the current standard for computational social science publication is a demonstration of novel causes, but this limits the relevance of models to solve problems and propose policies that benefit from the simpler and less surprising answers associated with the most important causes, or the combination of all causes. Fourth, most singular quantitative methods applied on their own did not help to solve most analytical challenges, and we explored a range of established and emerging methods, including probabilistic programming, deep neural networks, systems of predictive probabilistic finite state machines, and more, to achieve plausible solutions. However, despite these limitations common to the current practice of computational social science, we find on the positive side that even imperfect knowledge can be sufficient to identify robust predictions if a more pluralistic approach is applied.
Applying competing approaches by distinct subteams, including at one point the vast TopCoder.com global community of problem solvers, enabled the discovery of many aspects of the relevant structure underlying the worlds that singular methods could not uncover. Together, these lessons suggest how different a policy-oriented computational social science would be from the computational social science we have inherited. A computational social science that serves policy would need to endure more failure, sustain more diversity, maintain more uncertainty, and allow for more complexity than current institutions support.
Food supply network disruption and mitigation: an integrated perspective of traceability technology and network structure
The 2019 coronavirus disease (COVID-19) epidemic has caused serious disruptions in food supply networks. Based on the case of the remerging epidemic in China, this paper aims to investigate food supply network disruption and its mitigation from technical and structural perspectives. To solve the optimal policy choice problem of how to improve the mitigation capability of food supply networks by using traceability technology and adjusting network structure, the occurrence mechanism of food supply network disruptions is revealed through a case study of the remerging COVID-19 outbreak in Beijing's Xinfadi market. Five typical traceability solutions are proposed to mitigate network disruptions, and their technical attributes are analyzed to establish disruption mitigation models. The structure of food supply networks is also controlled to mitigate disruptions. The structural attributes of three fundamental networks are extracted to adjust the network connection pattern in the disruption mitigation models. Next, simulation experiments involving the disruption mitigation models are carried out to explore the independent and joint effects of traceability technology and network structure on mitigation capability. The findings suggest that accuracy has a more positive effect on the mitigation capability of food supply networks than timeliness, due to the different technical compositions behind them; the difference between these effects determines supply networks' choice of traceability solution type. Likewise, betweenness centralization has a positive effect, but degree centralization has a negative effect, on mitigation capability, because intermediary firms and focal firms in food supply networks have different behavioral characteristics; these effects are both regulated by supply network type and exhibit different sensitivities.
As for the joint effect of technical and structural attributes on mitigation capability, the joint effect of accuracy and betweenness centralization is bigger than either independent effect but smaller than their sum; the joint effect of timeliness and betweenness centralization depends on the network type; and the positive effect of accuracy or timeliness on mitigation capability is greater than the negative effect of degree centralization. These joint effects arise from the complicated interactions between technical composition and the behaviors of intermediary firms or focal firms. These findings contribute to disruption management and decision-making theories and practices.
Social cybersecurity: an emerging science
With the rise of online platforms where individuals could gather and spread information came the rise of online cybercrimes aimed at taking advantage of not just single individuals but collectives. In response, researchers and practitioners began trying to understand this digital playground and the ways in which individuals who were socially and digitally embedded could be manipulated. What is emerging is a new scientific and engineering discipline: social cybersecurity. This paper defines this emerging area, provides case examples of the research issues and types of tools needed, and lays out a program of research in this area.
SCAMP's stigmergic model of social conflict
SCAMP (Social Causality using Agents with Multiple Perspectives) is one of four social simulators that generated socially realistic data for the Ground Truth program. Unlike the other three simulators, it is based on a computational principle, stigmergy, inspired by social insects. Using this approach, we modeled conflict in a nation-state inspired by the ongoing scenario in Syria. This paper summarizes stigmergy and describes the Conflict World we built in SCAMP.