Gendered discourse in the digital realm: Dataset of 48 Gender Linguistic Patterns

Solomiia Fedushko
Wed, 01/17/2024 - 17:41
The dataset explores the linguistic characteristics of members of the Ukrainian online community "Lviv. Forum Ridne City", grouped by gender (female/male). It includes vectors of the male and female profiles, along with 36 control vectors: 18 for women's profiles and 18 for men's profiles. The dataset covers 48 linguistic characteristics of gender in online communication, ranging from apology, modal designs, and emotions to profanity and references to sports and politics. It aims to provide insight into how gender shapes the linguistic patterns of community members within this Ukrainian forum.


This dataset describes the linguistic patterns of members of the Ukrainian online community "Lviv. Forum Ridne City", categorized by gender (female/male).

It encompasses 48 distinct linguistic characteristics: Apology; Modal designs; Mention of emotions and feelings; Evading the answer; Time links; Affirmation; Order; Long words; Profanity; Adjective expressions without meaning; Social and family vocabulary; Conditionality of actions; Sports and politics, automobile, technological and innovative vocabulary; Reference to quantity, value; Indication of a person, event, etc.; An indication of a person speaking about himself; Verbal fillings; Phraseologisms; Spatial references; Lack of confidence; Axiological modal judgments; Politeness; Accentuation; Amplification of meaning; Euphemisms; Point to multiple speakers; Geographic references; Diminutive caressing forms; Oaths and oaths; Remembering past events; Exclamatory intonation; Expression of support; Contrasting; Consent; Rehash; Indirect commands and requests; Implication; Impersonal sentence; Meaningless forms; Specification; Denial; Discussion of current problems and current topics of today; Question; Separate clause; Direct quote; Humor; Elliptical sentences; Justification.

The dataset includes vectors representing a male and a female profile, along with 36 control vectors: 18 for women's and 18 for men's online profiles. Together, these linguistic features support analysis of the communication patterns exhibited by online community members in this Ukrainian forum.
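
As an illustration of how profile and control vectors of this kind might be used, the sketch below compares one control vector against the two reference profiles by cosine similarity. This is not the author's method: the file layout and value scale of the dataset are not described here, so all numbers are synthetic placeholders, and the names `cosine_similarity` and `closest_profile` are hypothetical helpers introduced only for this example.

```python
import math

N_FEATURES = 48  # the dataset describes 48 linguistic characteristics


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def closest_profile(control, profiles):
    """Return the label of the reference profile most similar to `control`."""
    return max(profiles, key=lambda label: cosine_similarity(control, profiles[label]))


# Synthetic placeholder vectors (assumption): NOT values from the dataset.
female_profile = [1.0 if i % 2 == 0 else 0.0 for i in range(N_FEATURES)]
male_profile = [0.0 if i % 2 == 0 else 1.0 for i in range(N_FEATURES)]
control = [0.9 if i % 2 == 0 else 0.2 for i in range(N_FEATURES)]

profiles = {"female": female_profile, "male": male_profile}
print(closest_profile(control, profiles))  # prints "female" for these placeholders
```

Cosine similarity is used here only because it ignores overall vector magnitude; a different distance measure may suit the actual data better.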