A systematic review of the use of topic models for short text social media analysis
Recently, research on short text topic models has addressed the challenges of social media datasets. These models are typically evaluated using automated measures. However, recent work suggests that these evaluation measures do not indicate whether the topics produced can yield meaningful insights for those examining social media data. Efforts to address this issue, including gauging the alignment between automated and human evaluation tasks, are hampered by a lack of knowledge about how researchers use topic models. Further problems could arise if researchers do not construct topic models optimally or use them in a way that exceeds the models' limitations. These scenarios threaten the validity of topic model development and of the insights produced by researchers employing topic modelling as a methodology. However, there is currently little information about how and why topic models are used in applied research. We therefore performed a systematic literature review of 189 articles in which topic modelling was used for social media analysis, to understand how and why researchers apply these models. Our results suggest that the development of topic models is not aligned with the needs of those who use them for social media analysis. We found that researchers use topic models sub-optimally and that there is a lack of methodological support for building and interpreting topics. We offer a set of recommendations for topic model researchers to address these problems and bridge the gap between development and applied research on short text topic models.
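To make concrete the modelling step the review examines, the following minimal sketch shows how a topic model is typically built and its topics inspected via top-weighted terms; the corpus, topic count, and scikit-learn LDA choice are ours, purely illustrative, not the configurations used by the reviewed studies.

```python
# A minimal sketch of building and inspecting a short-text topic model;
# all data and parameter choices here are illustrative.
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "new phone battery lasts all day",
    "battery drains fast after update",
    "great camera and screen quality",
    "screen cracked after one drop",
]

# Short texts are sparse, so keep the vocabulary small.
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

# Topics are usually interpreted via their highest-weighted terms, which is
# exactly the artefact whose human evaluation the review discusses.
terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top_terms = [terms[i] for i in weights.argsort()[::-1][:4]]
    print(f"topic {k}: {top_terms}")
```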
Impact of employee digital competence on the relationship between digital autonomy and innovative work behavior: a systematic review
With the advent of the COVID-19 pandemic, concern regarding employee digital competence has increased significantly. Several studies survey the area, but none describe how employee digital competence affects the relationship between digital autonomy and innovative work behaviour. Hence, a survey is needed that provides a deeper understanding of these concerns and suggests suitable directions for other researchers. Using scientific publication databases and adhering to the PRISMA statement, this systematic literature review offers a current overview of the impact of employee digital competence on the relationship between digital autonomy and innovative work behaviour from 2015 to 2022, covering definitions, research purposes, methodologies, outcomes, and limitations. Of the selected articles, 18 were examined under relationship topics and 12 reported on impact topics across different tasks. The main findings highlight the significance of digital competence and autonomy in promoting employee creativity, learning, and knowledge sharing. According to the review findings, employees with greater digital autonomy are more likely to engage in innovative work, leading to improved job performance and empowerment. Organizations should therefore prioritize the development of digital autonomy by providing access to digital tools, training, and a supportive work environment. Overall, the current review indicates a strong positive correlation between digital autonomy, innovative work behaviour, and employee impact. This underscores the importance for organizations not only to invest in digital competence and skills, but also to create a culture that values autonomy, creativity, and innovation among their employees.
A systematic review of social network sentiment analysis with comparative study of ensemble-based techniques
Sentiment Analysis (SA) of text reviews is an emerging concern in Natural Language Processing (NLP). It is a broadly applied method for analyzing and extracting opinions from text using individual or ensemble learning techniques, and it has unquestionable potential in the digital world and on social media platforms. Therefore, we present a systematic survey that organizes and describes the current state of SA and provides a structured overview of proposed approaches, from traditional to advanced. This work also discusses SA-related challenges, feature engineering techniques, benchmark datasets, popular publication platforms, and the best algorithms for advancing automatic SA. Furthermore, a comparative study has been conducted to assess the performance of bagging- and boosting-based ensemble techniques for social network SA. Bagging and boosting are the two major approaches of ensemble learning, each containing various ensemble algorithms for classifying sentiment polarity. Recent studies suggest that ensemble learning techniques are well suited to sentiment classification. This analytical study examines bagging- and boosting-based ensemble techniques on four benchmark datasets to provide extensive knowledge regarding ensemble techniques for SA. The efficiency and accuracy of these techniques have been measured in terms of TPR, FPR, weighted F-score, weighted precision, weighted recall, accuracy, ROC-AUC curve, and run time. Moreover, the comparative results reveal that bagging-based ensemble techniques outperformed boosting-based techniques for text classification. This extensive review aims to present benchmark information regarding social network SA that will be helpful for future research in this field.
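As a hedged illustration of the kind of comparison performed here (the toy corpus, shallow-tree base learner, and settings below are our own, not the paper's benchmark datasets), bagging and boosting ensembles can be contrasted on a small sentiment task as follows:

```python
# A toy contrast of bagging- vs boosting-based ensembles for sentiment
# polarity; base learners are passed positionally to work across
# scikit-learn versions.
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

texts = [
    "loved this film", "terrible plot and acting",
    "great soundtrack", "boring and too long",
    "wonderful cast", "worst movie this year",
    "really enjoyable", "awful dialogue",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

X = TfidfVectorizer().fit_transform(texts)  # fit on all data: fine for a toy demo

base = DecisionTreeClassifier(max_depth=2)  # shallow trees for both ensembles
models = {
    "bagging": BaggingClassifier(base, n_estimators=50, random_state=0),
    "boosting": AdaBoostClassifier(base, n_estimators=50, random_state=0),
}
for name, model in models.items():
    scores = cross_val_score(model, X, labels, cv=2, scoring="accuracy")
    print(f"{name}: mean accuracy = {scores.mean():.2f}")
```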
An exhaustive review of the metaheuristic algorithms for search and optimization: taxonomy, applications, and open challenges
As the world moves towards industrialization, optimization problems become more challenging to solve in a reasonable time. More than 500 new metaheuristic algorithms (MAs) have been developed to date, with over 350 of them appearing in the last decade. The literature has grown significantly in recent years and should be thoroughly reviewed. In this study, approximately 540 MAs are tracked, and statistical information is also provided. Due to the proliferation of MAs in recent years, the issue of substantial similarities between algorithms with different names has become widespread. This raises an essential question: can an optimization technique be called 'novel' if its search behaviour is only slightly modified from, or almost identical to, that of existing methods? Many recent MAs are said to be based on 'novel ideas', so these are discussed. Furthermore, this study categorizes MAs based on the number of control parameters, which is a new taxonomy in the field. MAs have been extensively employed in various fields as powerful optimization tools, and some of their real-world applications are demonstrated. A few limitations and open challenges have been identified, which may lead to new directions for MAs in the future. Although researchers have reported many excellent results in research papers, review articles, and monographs during the last decade, many areas remain unexplored. This study will assist newcomers in understanding some of the major domains of metaheuristics and their real-world applications. We anticipate this resource will also be useful to our research community.
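For readers new to the field, a minimal simulated-annealing sketch illustrates the accept-or-escape template shared by many MAs; the objective, cooling schedule, and neighbourhood below are illustrative choices of ours, not drawn from the review.

```python
# A minimal simulated-annealing sketch on a toy multimodal objective;
# nothing here is tuned.
import math
import random

def objective(x):
    return x * x + 10 * math.sin(x)  # toy multimodal function

random.seed(0)
x = random.uniform(-10, 10)
best = x
temperature = 10.0

for step in range(5000):
    candidate = x + random.gauss(0, 0.5)  # local perturbation
    delta = objective(candidate) - objective(x)
    # Accept improvements always; accept worse moves with a probability that
    # shrinks as the temperature cools (the metaheuristic's escape hatch).
    if delta < 0 or random.random() < math.exp(-delta / temperature):
        x = candidate
    if objective(x) < objective(best):
        best = x
    temperature *= 0.999

print(f"approximate minimizer: {best:.3f}, value: {objective(best):.3f}")
```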
Multiple criteria decision analytic methods in management with T-spherical fuzzy information
With a focus on T-spherical fuzzy (T-SF) sets, the aim of this paper is to create a novel appraisal mechanism and an innovative decision analytic method for use with multiple-criteria assessment and selection in uncertain situations. The T-SF framework is the most recent advancement in fuzzy settings and uses four facets (membership grades of positivity, neutrality, negativity, and refusal) to elucidate complex uncertainties, thereby reducing information loss and more fully capturing indistinct and equivocal information. This paper adds to the body of knowledge on multiple criteria choice modeling by introducing T-SF correlation-oriented measurements connected to the fixed and displaced ideal/anti-ideal benchmarks and by creating an approachable appraisal mechanism for advancing a T-SF decision analytic methodology. In particular, it considers the performance ratings of available options against judging criteria under T-SF uncertainty. The research gives correlation-oriented measurements based on two varieties of maximum and square root functions in T-SF situations, which serve as a solid foundation for an efficacious appraisal mechanism from two views of anchored judgments corresponding to the fixed and displaced benchmarks. The T-SF Minkowski distance index is generated to integrate the outranking and outranked identifiers, relying on correlation-oriented measurements to derive local outranking and outranked indices. The T-SF decision analytic procedures are constructed using a new appraisal significance index founded on insights from correlation-oriented maximizing and minimizing indices as well as global outranking and outranked indices. Additionally, a concrete location selection problem is addressed to showcase the applicability and efficiency of the suggested T-SF decision analytic methodology. Sensitivity analyses and comparative studies are carried out to investigate substantial modifications in pertinent parameters and to confirm the robustness of the predominance relationships among the available options. The suggested approaches are adaptable, flexible, and reliable, according to the application outcomes and comparison findings. This research provides four scientific contributions: (1) the utilization of T-SF correlation coefficients as the basis for prioritization analysis involving multiple criteria assessments, (2) the evolution of the T-SF Minkowski distance index to model outranking decision-making processes, (3) the creation of a reliable appraisal mechanism based on T-SF correlation-oriented measurements for intelligent decision support, and (4) the advancement of computational tools and procedures (e.g., correlation-oriented maximizing and minimizing indices, global outranking and outranked indices, and appraisal significance indices) to perform the decision analytic procedure in T-SF settings. In terms of managerial implications, the solution findings support the employment of the fixed ideal/anti-ideal benchmarking mechanism, as its measurements and indices are easy to operate and suitably sensitive. Next, in practical implementations of the T-SF decision analytic procedure, it is advised to utilize the T-SF Manhattan distance index for calculating convenience. Finally, the T-SF decision analytic techniques offer fundamental ideas and measurements appropriate for manipulating T-SF information in complex decision situations, thereby increasing the application potential in the area of decision-making under information uncertainty.
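For reference, a standard formal definition of a T-SF set, stated in common notation (with t as the power parameter), is:

```latex
% Standard definition of a T-spherical fuzzy set; notation follows common
% usage in the T-SF literature.
A T-spherical fuzzy set $A$ on a universe $X$ is
\[
  A = \{\, \langle x,\ \mu_A(x),\ \eta_A(x),\ \nu_A(x) \rangle : x \in X \,\},
\]
where $\mu_A, \eta_A, \nu_A : X \to [0,1]$ are the grades of positivity
(membership), neutrality, and negativity, subject to
\[
  0 \le \mu_A(x)^{t} + \eta_A(x)^{t} + \nu_A(x)^{t} \le 1, \qquad t \ge 1,
\]
and the fourth facet, the refusal grade, is
\[
  r_A(x) = \bigl(1 - \mu_A(x)^{t} - \eta_A(x)^{t} - \nu_A(x)^{t}\bigr)^{1/t}.
\]
```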
Classification of spinal curvature types using radiography images: deep learning versus classical methods
Scoliosis is a spinal abnormality with two types of curves (C-shaped or S-shaped). The vertebrae of the spine reach equilibrium at different times, which makes it challenging to detect the type of curve; observer bias and image quality add further difficulty. This paper aims to evaluate spinal deformity by automatically classifying the type of spine curvature. Automatic spinal curvature classification is performed using SVM and KNN algorithms, and using pre-trained Xception and MobileNetV2 networks with SVM as the final activation function to avoid the vanishing gradient problem. Different feature extraction methods were used to investigate how well the SVM and KNN machine learning methods detect the curvature type. Features are extracted from representations of the radiographic images, which fall into two groups: (i) low-level image representations such as texture features, and (ii) local patch-based representations such as Bag of Words (BoW). These features are then classified by SVM and KNN. In the pre-trained deep networks, the feature extraction process is automated. In this study, 1000 anterior-posterior (AP) radiographic images of the spine were collected as a private dataset from Shafa Hospital, Tehran, Iran. Transfer learning was used due to the relatively small size of this private dataset. Based on the results of these experiments, the pre-trained deep networks were found to be approximately 10% more accurate than the classical methods in classifying whether the spinal curvature is C-shaped or S-shaped. As a result of automatic feature extraction, the pre-trained Xception and MobileNetV2 networks with SVM as the final activation function to control the vanishing gradient perform better than the classical machine learning methods for classification of spinal curvature types.
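A hedged sketch of the transfer-learning recipe described above follows, with a frozen pre-trained MobileNetV2 backbone feeding an SVM; the image and label arrays are placeholders, since the radiographs themselves are private.

```python
# Frozen pre-trained backbone as a feature extractor, SVM as the classifier;
# data arrays are synthetic stand-ins for preprocessed AP radiographs.
import numpy as np
import tensorflow as tf
from sklearn.svm import SVC

backbone = tf.keras.applications.MobileNetV2(
    weights="imagenet", include_top=False, pooling="avg",
    input_shape=(224, 224, 3),
)
backbone.trainable = False  # keep ImageNet weights fixed for the small dataset

def extract_features(images):
    # images: float array of shape (n, 224, 224, 3) with values in [0, 255]
    x = tf.keras.applications.mobilenet_v2.preprocess_input(images)
    return backbone.predict(x, verbose=0)

# Placeholders for radiographs and C-shaped (0) / S-shaped (1) labels.
train_images = np.random.rand(16, 224, 224, 3).astype("float32") * 255
train_labels = np.array([0, 1] * 8)

svm = SVC(kernel="rbf")
svm.fit(extract_features(train_images), train_labels)
```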
Knowledge Graphs: Opportunities and Challenges
With the explosive growth of artificial intelligence (AI) and big data, it has become vitally important to organize and represent the enormous volume of knowledge appropriately. As graph data, knowledge graphs accumulate and convey knowledge of the real world. It has been well recognized that knowledge graphs effectively represent complex information; hence, they have rapidly gained the attention of academia and industry in recent years. Thus, to develop a deeper understanding of knowledge graphs, this paper presents a systematic overview of this field. Specifically, we focus on the opportunities and challenges of knowledge graphs. We first review the opportunities of knowledge graphs in terms of two aspects: (1) AI systems built upon knowledge graphs; (2) potential application fields of knowledge graphs. Then, we thoroughly discuss severe technical challenges in this field, such as knowledge graph embeddings, knowledge acquisition, knowledge graph completion, knowledge fusion, and knowledge reasoning. We expect that this survey will shed new light on future research and the development of knowledge graphs.
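To illustrate one of the challenges listed, knowledge graph embeddings, here is a toy TransE-style scoring sketch; the entities, relation, and vectors are invented for illustration.

```python
# TransE-style intuition: entities and relations share one vector space, and
# a triple (head, relation, tail) is plausible when head + relation is close
# to tail. The embeddings below are toy values, not learned.
import numpy as np

embeddings = {
    "Paris":      np.array([0.9, 0.1]),
    "France":     np.array([1.0, 1.0]),
    "capital_of": np.array([0.1, 0.9]),
}

def transe_score(head, relation, tail):
    # Lower distance means a more plausible triple under the TransE assumption.
    return np.linalg.norm(embeddings[head] + embeddings[relation] - embeddings[tail])

print(transe_score("Paris", "capital_of", "France"))  # small, hence plausible
```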
Applications of AI in oil and gas projects towards sustainable development: a systematic literature review
Oil and gas construction projects are critical for meeting global demand for fossil fuels, but they also present unique risks and challenges that require innovative construction approaches. Artificial Intelligence (AI) has emerged as a promising technology for tackling these challenges, and this study examines its applications for sustainable development in the oil and gas industry. Using a systematic literature review (SLR), this research evaluates research trends from 2011 to 2022 and provides a detailed analysis of how well AI suits oil and gas construction. A total of 115 research articles were reviewed to identify original contributions, and the findings indicate a positive trend in AI research related to oil and gas construction projects, especially after 2016. The originality of this study lies in its comprehensive analysis of the latest research on AI applications in the oil and gas industry, its recommendations for improving the sustainability of oil and gas projects, and its insight into the most promising AI applications and methodologies for driving sustainable development in the sector.
A systematic review of artificial intelligence impact assessments
Artificial intelligence (AI) is producing highly beneficial impacts in many domains, from transport to healthcare, from energy distribution to marketing, but it also raises concerns about undesirable ethical and social consequences. AI impact assessments (AI-IAs) are a way of identifying positive and negative impacts early on to safeguard AI's benefits and avoid its downsides. This article describes the first systematic review of these AI-IAs. Working with a population of 181 documents, the authors identified 38 actual AI-IAs and subjected them to a rigorous qualitative analysis with regard to their purpose, scope, organisational context, expected issues, timeframe, process and methods, transparency and challenges. The review demonstrates some convergence between AI-IAs. It also shows that the field is not yet at the point of full agreement on content, structure and implementation. The article suggests that AI-IAs are best understood as means to stimulate reflection and discussion concerning the social and ethical consequences of AI ecosystems. Based on the analysis of existing AI-IAs, the authors describe a baseline process of implementing AI-IAs that can be implemented by AI developers and vendors and that can be used as a critical yardstick by regulators and external observers to evaluate organisations' approaches to AI.
Open-source intelligence: a comprehensive review of the current state, applications and future perspectives in cyber security
The volume of data generated by today's digitally connected world is enormous, and a significant portion of it is publicly available. These data sources include web archives, public databases, email, Telegram, and social networks such as Facebook, Twitter, and LinkedIn. Open-source intelligence (OSINT) extracts information from collections of publicly available and accessible data, and can provide a solution to the challenges of extracting and gathering intelligence from various public sources and social networks. OSINT is currently expanding at an incredible rate, bringing new artificial intelligence-based approaches to issues of national security, political campaigns, the cyber industry, criminal profiling, and society, as well as cyber threats and crimes. In this paper, we describe the current state of OSINT tools and techniques and the state of the art for various applications of OSINT in cyber security. In addition, we discuss the challenges and future directions for developing autonomous models that can address social network security, digital forensics, and cyber crime problems using various machine learning (ML), deep learning (DL), and artificial intelligence (AI) techniques combined with OSINT.
Review on chest pathologies detection systems using deep learning techniques
Chest radiography is the standard and most affordable way to diagnose, analyze, and examine different thoracic and chest diseases. Typically, the radiograph is examined by an expert radiologist or physician to decide whether a particular anomaly exists. Moreover, computer-aided methods are used to assist radiologists and make the analysis process accurate, fast, and more automated. A tremendous improvement in automatic chest pathology detection and analysis can be observed with the emergence of deep learning. This survey aims to review, technically evaluate, and synthesize the different computer-aided chest pathology detection systems. State-of-the-art single- and multi-pathology detection systems published in the last five years are thoroughly discussed. A taxonomy of image acquisition, dataset preprocessing, feature extraction, and deep learning models is presented, and the mathematical concepts underlying the feature extraction model architectures are discussed. Moreover, the different articles are compared based on their contributions, datasets, methods used, and results achieved. The article ends with the main findings, current trends, challenges, and future recommendations.
Medical image data augmentation: techniques, comparisons and interpretations
Designing deep learning based methods for medical images has always been an attractive area of research to assist clinicians in rapid examination and accurate diagnosis. Such methods need large datasets covering all variations in their training stages. On the other hand, medical images are often scarce for several reasons: too few patients for some diseases, patients unwilling to allow their images to be used, a lack of suitable medical equipment, and the inability to obtain images that meet the desired criteria. This issue leads to biased datasets, overfitting, and inaccurate results. Data augmentation is a common solution to overcome this issue, and various augmentation techniques have been applied to different types of images in the literature. However, it is not clear which data augmentation technique provides more efficient results for which image type, since the literature handles different diseases, uses different network architectures, and trains and tests these architectures with different amounts of data. Therefore, in this work, the augmentation techniques used to improve the performance of deep learning based diagnosis of diseases in different organs (brain, lung, breast, and eye) from different imaging modalities (MR, CT, mammography, and fundoscopy) have been examined. Also, the most commonly used augmentation methods have been implemented, and their effectiveness in classifications with a deep network has been discussed based on quantitative performance evaluations. Experiments indicated that augmentation techniques should be chosen carefully according to image types.
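As a hedged illustration of the kind of augmentation pipeline examined in the work (the specific transforms and parameter ranges below are our own choices, not the paper's protocol):

```python
# Common image augmentations with torchvision; choices are illustrative and,
# as the review argues, should be matched to the image type.
from PIL import Image
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomRotation(degrees=10),              # mild rotation
    transforms.RandomHorizontalFlip(p=0.5),             # can be unsafe, e.g. for chest X-rays
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.RandomResizedCrop(224, scale=(0.9, 1.0)),
    transforms.ToTensor(),
])

image = Image.new("RGB", (256, 256))  # placeholder for a real scan
augmented = augment(image)            # one randomly augmented tensor per call
```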
Video summarization using deep learning techniques: a detailed analysis and investigation
One of the critical multimedia analysis problems in today's digital world is video summarization (VS). Many VS methods based on deep learning have been suggested. Nevertheless, these are inefficient at processing, extracting, and deriving information from long-duration videos in a minimal amount of time. A detailed analysis and investigation of numerous deep learning approaches is carried out to determine the root of the problems that different deep learning methods face in identifying and summarizing the essential activities in such videos. Various deep learning techniques have been investigated and examined for their event detection and summarization capability across multiple activities. Keyframe selection, event detection, categorization, and activity-feature summarization are considered for each activity, and the limitations of each category are discussed in depth. Concerns about detecting low activity using deep networks on various types of public datasets are also discussed, and viable strategies are suggested to evaluate and improve the generated video summaries on such datasets. Moreover, potential applications recommended in the literature are listed, and various deep learning tools for experimental analysis are discussed. Future directions are presented for further exploration of research in VS using deep learning strategies.
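To ground one building block named above, keyframe selection, here is a deliberately simple frame-differencing sketch; the file name and threshold are hypothetical, and the review's own methods are deep learning based.

```python
# Naive keyframe selection: keep a frame when it differs enough from the
# previous one. A baseline, not a deep learning summarizer.
import cv2

cap = cv2.VideoCapture("input.mp4")  # hypothetical input path
keyframes, prev = [], None
threshold = 30.0  # illustrative mean-absolute-difference cutoff

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if prev is None or cv2.absdiff(gray, prev).mean() > threshold:
        keyframes.append(frame)
    prev = gray

cap.release()
print(f"selected {len(keyframes)} keyframes")
```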
Federated learning for 6G-enabled secure communication systems: a comprehensive survey
Machine learning (ML) and deep learning (DL) models are popular in many areas, from business and medicine to industry, healthcare, transportation, smart cities, and many more. However, conventional centralized training techniques may not suit upcoming distributed applications, which require high accuracy and quick response times, mainly due to limited storage and performance bottlenecks on centralized servers during the execution of various ML and DL based models. Federated learning (FL), in contrast, is a developing approach for training ML models in a collaborative and distributed manner. It allows the full potential of these models to be exploited with unlimited data and distributed computing power. In FL, edge computing devices collaborate to train a global model on their private data and computational power without sharing that data on the network, thereby offering privacy preservation by default. However, the distributed nature of FL raises various challenges related to data heterogeneity, client mobility, scalability, and seamless data aggregation. Moreover, the communication channels, clients, and central servers are also vulnerable to attacks, which pose various security threats. Thus, a structured vulnerability and risk assessment is needed to deploy FL successfully in real-life scenarios. Furthermore, the scope of FL is expanding in terms of its application areas, with each area facing different threats. In this paper, we analyze various vulnerabilities present in the FL environment and survey possible threats from the perspective of different application areas. We also review the most recent defensive algorithms and strategies used to guard against security and privacy threats in those areas. For systematic coverage of the topic, we consider various applications under four main categories: space, air, ground, and underwater communications. We also compare the proposed methodologies regarding the underlying approach, base model, datasets, evaluation metrics, and achievements. Lastly, the future directions and existing drawbacks of the various approaches are discussed in detail.
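A minimal FedAvg-style sketch of the FL loop described above, using a toy linear model and synthetic client data (purely illustrative):

```python
# FedAvg in miniature: clients train locally on private data, and only model
# parameters travel to the server for averaging.
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    # One client's local training: plain linear-regression gradient steps.
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
clients = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(4)]
global_w = np.zeros(3)

for round_ in range(10):
    # Each client starts from the current global model; raw data never leaves it.
    local_ws = [local_update(global_w, X, y) for X, y in clients]
    # The server aggregates by (here, unweighted) parameter averaging.
    global_w = np.mean(local_ws, axis=0)

print("global model after 10 rounds:", global_w)
```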
Sentiment analysis: A survey on design framework, applications and future scopes
Sentiment analysis is a solution that enables the extraction of a summarized opinion or minute sentimental details regarding any topic or context from a voluminous source of data. Even though several research papers address various sentiment analysis methods, implementations, and algorithms, a paper that includes a thorough analysis of the process for developing an efficient sentiment analysis model is highly desirable. Various factors such as extraction of relevant sentimental words, proper classification of sentiments, dataset, data cleansing, etc. heavily influence the performance of a sentiment analysis model. This survey presents a systematic and in-depth knowledge of different techniques, algorithms, and other factors associated with designing an effective sentiment analysis model. The paper performs a critical assessment of different modules of a sentiment analysis framework while discussing various shortcomings associated with the existing methods or systems. The paper proposes potential multidisciplinary application areas of sentiment analysis based on the contents of data and provides prospective research directions.
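As a hedged sketch of the basic module chain the survey assesses (data cleansing, feature extraction, classification); the corpus and cleaning rules are illustrative:

```python
# A minimal cleanse -> vectorize -> classify pipeline for sentiment analysis.
import re
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

def cleanse(text):
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # strip URLs
    text = re.sub(r"[^a-z\s]", " ", text)      # strip punctuation and digits
    return text

texts = ["Great service!!! http://x.co", "Very disappointing product :("]
labels = [1, 0]

model = Pipeline([
    ("tfidf", TfidfVectorizer(preprocessor=cleanse, stop_words="english")),
    ("clf", LogisticRegression()),
])
model.fit(texts, labels)
print(model.predict(["disappointing service"]))
```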
A framework for measuring the training efficiency of a neural architecture
Measuring efficiency in neural network system development is an open research problem. This paper presents an experimental framework to measure the training efficiency of a neural architecture. To demonstrate our approach, we analyze the training efficiency of Convolutional Neural Networks (CNNs) and Bayesian equivalents (BCNNs) on the MNIST and CIFAR-10 tasks. Our results show that training efficiency decays as training progresses and varies across different stopping criteria for a given neural model and learning task. We also find a non-linear relationship between training stopping criteria, model size, and training efficiency. Furthermore, we illustrate the potential confounding effects of overtraining on measuring the training efficiency of a neural architecture. Regarding relative training efficiency across different architectures, our results indicate that CNNs are more efficient than BCNNs on both datasets. More generally, as a learning task becomes more complex, the relative difference in training efficiency between different architectures becomes more pronounced.
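One illustrative way to operationalize such a measurement, validation accuracy per unit of training time tracked across epochs, is sketched below; this is our reading of the idea under stated assumptions, not the paper's exact framework.

```python
# "Training efficiency" read here as validation accuracy gained per second of
# cumulative training time; the metric, model, and dataset are illustrative.
import time
import warnings
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

warnings.filterwarnings("ignore")  # MLP warns about non-convergence at max_iter=1

X_tr, X_te, y_tr, y_te = train_test_split(*load_digits(return_X_y=True), random_state=0)

# warm_start=True with max_iter=1 trains one epoch per fit() call.
model = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1, warm_start=True, random_state=0)

elapsed = 0.0
for epoch in range(1, 21):
    start = time.perf_counter()
    model.fit(X_tr, y_tr)
    elapsed += time.perf_counter() - start
    efficiency = model.score(X_te, y_te) / elapsed  # accuracy per second so far
    if epoch % 5 == 0:
        print(f"epoch {epoch:2d}: efficiency = {efficiency:.1f} accuracy/sec")
```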
A review of evaluation approaches for explainable AI with applications in cardiology
Explainable artificial intelligence (XAI) elucidates the decision-making process of complex AI models and is important in building trust in model predictions. XAI explanations themselves require evaluation with regard to their accuracy and reasonableness, and in the context of use of the underlying AI model. This review details the evaluation of XAI in cardiac AI applications and finds that, of the studies examined, 37% evaluated XAI quality using literature results, 11% used clinicians as domain experts, 11% used proxies or statistical analysis, and the remaining 43% did not assess the XAI used at all. We aim to inspire additional studies within healthcare, urging researchers not only to apply XAI methods but to systematically assess the resulting explanations, as a step towards developing trustworthy and safe models.
Knowledge transfer in lifelong machine learning: a systematic literature review
Lifelong Machine Learning (LML) denotes a scenario involving multiple sequential tasks, each accompanied by its respective dataset, in order to solve specific learning problems. In this context, the focus of LML techniques is on utilizing already acquired knowledge to adapt to new tasks efficiently. Essentially, LML is concerned with facing new tasks while exploiting the knowledge previously gathered from earlier tasks, not only to help in adapting to new tasks but also to enrich the understanding of past ones. Understanding this concept helps in grasping one of the major obstacles in LML, known as Knowledge Transfer (KT). This systematic literature review aims to explore state-of-the-art KT techniques within LML and assess the evaluation metrics and commonly utilized datasets in this field, thereby keeping the LML research community updated with the latest developments. From an initial pool of 417 articles from four distinguished databases, 30 were deemed highly pertinent for the information extraction phase. The analysis recognizes four primary KT techniques: Replay, Regularization, Parameter Isolation, and Hybrid. This study delves into the characteristics of these techniques across both neural network (NN) and non-neural network (non-NN) frameworks, highlighting their distinct advantages that have captured researchers' interest. It was found that the majority of the studies focused on supervised learning within an NN modelling framework, particularly employing Parameter Isolation and Hybrid approaches for KT. The paper concludes by pinpointing research opportunities, including investigating non-NN models for Replay and exploring applications outside of computer vision (CV).
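To make the Replay category concrete, here is a minimal rehearsal-buffer sketch; the buffer capacity, reservoir sampling, and placeholder data are illustrative choices.

```python
# Replay in miniature: keep a small buffer of examples from earlier tasks and
# mix them into each new task's training data.
import random

class ReplayBuffer:
    def __init__(self, capacity=200):
        self.capacity = capacity
        self.buffer = []
        self.seen = 0

    def add(self, example):
        # Reservoir sampling keeps a uniform sample over everything seen so far.
        self.seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.buffer[j] = example

    def sample(self, k):
        return random.sample(self.buffer, min(k, len(self.buffer)))

buffer = ReplayBuffer()
for task_id in range(3):
    task_data = [(task_id, i) for i in range(100)]  # placeholder examples
    train_set = task_data + buffer.sample(50)       # rehearse old tasks too
    # ... train the model on train_set here ...
    for ex in task_data:
        buffer.add(ex)
```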
A comprehensive review of data-driven approaches for forecasting production from unconventional reservoirs: best practices and future directions
Prediction of well production from unconventional reservoirs is a complex problem, given an incomplete understanding of the physics despite large amounts of data. Recently, Data Analytics Techniques (DAT) have emerged as an effective approach to production forecasting for unconventional reservoirs. In some of these approaches, DAT are combined with physics-based models to capture the essential physical mechanisms of fluid flow in porous media, while leveraging the power of data-driven methods to account for uncertainties and heterogeneities. Here, we provide an overview of the applications and performance of DAT for production forecasting of unconventional reservoirs, examining and comparing predictive models using different algorithms, validation benchmarks, input data, numbers of wells, and formation types. We also discuss the strengths and limitations of each model, as well as the challenges and opportunities for future research in this field. Our analysis shows that machine learning (ML) based models can achieve satisfactory performance in forecasting production from unconventional reservoirs. We measure the performance of the models using two dimensionless metrics: mean absolute percentage error (MAPE) and coefficient of determination (R²). The predicted and actual production data show a high degree of agreement, as most of the models have a low error rate and a strong correlation. Specifically, ~65% of the models have a MAPE of less than 20%, and more than 80% of the models have an R² higher than 0.6. Therefore, we expect that DAT can improve the reliability and robustness of production forecasting for unconventional resources. However, we also identify some areas for future improvement, such as developing new ML algorithms, combining DAT with physics-based models, and establishing multi-perspective approaches for comparing model performance.
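For reference, the two reported metrics are standardly defined as follows, where y_i are the observed production values, ŷ_i the forecasts, and ȳ the mean of the observations:

```latex
% Standard definitions of the two reported metrics.
\[
  \mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|,
  \qquad
  R^{2} = 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^{2}}{\sum_{i=1}^{n}(y_i - \bar{y})^{2}}.
\]
```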
Three-way decisions in generalized intuitionistic fuzzy environments: survey and challenges
Enhancing decision-making under risks is crucial in various fields, and three-way decision (3WD) methods have been extensively utilized and proven to be effective in numerous scenarios. However, traditional methods may not be sufficient when addressing intricate decision-making scenarios characterized by uncertain and ambiguous information. In response to this challenge, the generalized intuitionistic fuzzy set (IFS) theory extends the conventional fuzzy set theory by introducing two pivotal concepts, i.e., membership degrees and non-membership degrees. These concepts offer a more comprehensive means of portraying the relationship between elements and fuzzy concepts, thereby boosting the ability to model complex problems. The generalized IFS theory brings about heightened flexibility and precision in problem-solving, allowing for a more thorough and accurate description of intricate phenomena. Consequently, the generalized IFS theory emerges as a more refined tool for articulating fuzzy phenomena. The paper offers a thorough review of the research advancements made in 3WD methods within the context of generalized intuitionistic fuzzy (IF) environments. First, the paper summarizes fundamental aspects of 3WD methods and the IFS theory. Second, the paper discusses the latest development trends, including the application of these methods in new fields and the development of new hybrid methods. Furthermore, the paper analyzes the strengths and weaknesses of research methods employed in recent years. While these methods have yielded impressive outcomes in decision-making, there are still some limitations and challenges that need to be addressed. Finally, the paper proposes key challenges and future research directions. Overall, the paper offers a comprehensive and insightful review of the latest research progress on 3WD methods in generalized IF environments, which can provide guidance for scholars and engineers in the intelligent decision-making field when facing situations characterized by various uncertainties.
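For readers new to the two ingredients, standard background definitions (notation follows common usage in the IFS and 3WD literature) are:

```latex
% Standard background definitions; alpha and beta are the usual 3WD thresholds.
An intuitionistic fuzzy set $A$ on a universe $X$ assigns each $x \in X$ a
membership degree $\mu_A(x)$ and a non-membership degree $\nu_A(x)$ with
\[
  0 \le \mu_A(x) + \nu_A(x) \le 1,
  \qquad
  \pi_A(x) = 1 - \mu_A(x) - \nu_A(x),
\]
where $\pi_A(x)$ is the hesitancy degree. A three-way decision partitions
options via thresholds $0 \le \beta < \alpha \le 1$ on an evaluation $v(x)$:
\[
  \text{accept if } v(x) \ge \alpha, \qquad
  \text{reject if } v(x) \le \beta, \qquad
  \text{defer otherwise.}
\]
```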
Contrasting Linguistic Patterns in Human and LLM-Generated News Text
We conduct a quantitative analysis contrasting human-written English news text with comparable large language model (LLM) output from six different LLMs covering three model families and four sizes in total. Our analysis spans several measurable linguistic dimensions, including morphological, syntactic, psychometric, and sociolinguistic aspects. The results reveal various measurable differences between human and AI-generated texts. Human texts exhibit more scattered sentence length distributions, a greater variety of vocabulary, a distinct use of dependency and constituent types, shorter constituents, and more optimized dependency distances. Humans also tend to exhibit stronger negative emotions (such as fear and disgust) and less joy than text generated by LLMs, with the toxicity of these models increasing as their size grows. LLM outputs use more numbers, symbols, and auxiliaries (suggesting objective language) than human texts, as well as more pronouns. The sexist bias prevalent in human text is also expressed by LLMs, and even magnified in all of them but one. Overall, the differences between LLMs and humans are larger than the differences among LLMs.
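Two of the simpler surface measures mentioned, sentence-length dispersion and lexical variety, can be computed as in the following sketch; the regex tokenization stands in for the paper's full morphosyntactic pipeline, and the example texts are invented.

```python
# Sentence-length dispersion and type-token ratio on toy texts; a rough
# stand-in for the paper's measures, not its actual pipeline.
import re
import statistics

def surface_stats(text):
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    tokens = re.findall(r"[a-z]+", text.lower())
    return {
        "mean_sentence_len": statistics.mean(lengths),
        "sentence_len_stdev": statistics.stdev(lengths) if len(lengths) > 1 else 0.0,
        "type_token_ratio": len(set(tokens)) / len(tokens),
    }

human = "The storm hit at dawn. Residents fled. Officials promised aid by nightfall."
llm = "The storm arrived in the morning. The residents left the area quickly."
print("human:", surface_stats(human))
print("llm:  ", surface_stats(llm))
```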