You’re reading default.blog. An emotional scrapbook of the Internet, technology, and the future.
Something strange happened this week. Media commentators are suddenly recognizing, almost in unison, that many major cultural shifts of recent years were accelerated by Covid lockdowns. We’re belatedly realizing that all that time online changed us.
The rise of The Free Press and the broader non-left media ecosystem, the rightward drift of many Internet personalities, the explosion of independent projects: all trace back to when institutional trust collapsed and we moved our lives almost entirely online. But confinement warped our minds in other ways too. When we spent more time in cyberspace than meatspace, our perception contorted. Daniel Kolitz’s awe-inspiring (and I don’t use that word lightly) Harper’s piece on the gooner subculture captures that shift, as do phenomena like TikTok’s monopoly on trend creation, the murders of UnitedHealthcare CEO Brian Thompson by Luigi Mangione and of Charlie Kirk by Tyler Robinson, the growing visibility of networks like 764, the tide of “slop violence,” the Pentagon leaks by Jack Teixeira, and the spread of “algo-speak.”
These aren’t isolated oddities—they’re symptoms of the same deeper reorientation. Unsurprisingly, everyone is now discussing the post-literate turn, also accelerated by Covid lockdowns: declining reading habits, collapsing attention spans, the screen eclipsing the page. One aspect of this transformation remains under-examined: the rise of voice.
Voice memos, podcasts, audiobooks.
I myself listen to more Substack posts than I read, and have recently had to bribe myself to start reading them again. But when I do read, I vocalize. Our machines have also started to talk back, though slowly, and incrementally — first Alexa and Siri, now ChatGPT. We’re both consuming more sound and thinking out loud. And as I’ve argued before, I believe that’s what explains the rise of “crying in your car videos” — a sense that we need to mediate our own expression to understand our feelings.
Voice collapses the distance between thought and expression. It is the perfect register for an age that values presence over patience. When we talk to a device, or listen to someone talk into one, we bypass the delay that literacy once demanded. The pause between idea and expression, that pause that made writing possible, has nearly vanished.
Our metaphors — if you think about it — further emphasize this shift. “Desktop,” “file,” “window”: as
writes so often about, these belong to a static era of computing. Voice is ambient and environmental. If text was something we looked at, voice is something we move through. has noted that reading once trained us to think in sequence, to slow down and structure thought, and that this skill is fading. In the United States, reading for pleasure has collapsed; in Britain, a third of adults no longer read books at all. The “reading revolution” that expanded consciousness in the eighteenth century is in retreat. But what’s emerging is not illiteracy, as everyone from
to Marriott to to my interviewee have pointed out: it’s post-literacy. For media ecologist Mir, the specifics of that change mean “digital orality,” a return to oral patterns of thought, but mediated through digital technology. Mir argues that voice isn’t the point and I’m focusing on the wrong thing. Digital orality, he insists, happens primarily through text and will continue to. The cognitive shift toward impulsivity and environmental immersion doesn’t require speaking at all, even if it may occasionally include it. He might be right!
But still, I can’t shake the feeling that voice technologies are doing something distinct that his framework doesn’t fully capture — I just don’t know what yet.
What follows is our conversation,¹ where we explore this tension. From post-literacy to post-human, perhaps?
Katherine: Does the rapid rise of voice-driven technologies (e.g., Siri, Alexa, voice memos) impact the shift away from a print culture? If so, how?
Andrey: Digital orality is not vocal or oral—it’s not its primary feature. It’s not about voicing information or communication. Digital orality is a cultural and cognitive phenomenon induced by new media, which may or may not use vocal/voice/audial channels. Before writing, humans were immersed in a physical (nature) and social (tribe) environment. They received information from their surroundings simultaneously, in the fashion of “acoustic space,” as McLuhan called it. Writing detached humans from the environment and forced them to immerse themselves in the contemplation of ideas and thoughts.
Unlike signals of the world that come from around through all senses working at once, writing forced the isolation of vision from other senses into a cognitive state, which McLuhan called “visual space.” The isolated sense of vision made other senses numb when a person writes or reads. This isolation of vision and numbness of other senses turned the sensory capacity of vision into a cognitive faculty of inner vision, what Walter Ong called the “inward turn.”
Writing enabled several cognitive transformations.
First, isolated vision and environmental detachment allowed long focus on ideas, something impractical and even dangerous in the natural environment. If you live in nature and concentrate for too long on your own ideas while detaching from the environment, someone or something can eat you. An oral/tribal person HAS to be immersed in surroundings, not ideas.
Second, unlike the immediate impulses typical of orality, writing and reading enabled a delay of reaction, which was used for contemplation. This led to deliberation, which, again, is not typical of “natural” environmental immersion, when individuals react fast and impulsively.
Third, writing, just technically, requires a linear organization of content. You need to write any content word after word, sentence after sentence, idea after idea—one thing at a time. The linear nature of writing structured not only writing itself but also thinking and, eventually, the world. The literate mind and the world perceived by it are structured because of the mere technicality of writing.
In short, the cognitive “inward turn,” enabled by writing, led to theorizing, classification, individualism, self-reflection, structuring of knowledge, rationalism, etc.
McLuhan noticed that radio and television—electronic media—require “empathic involvement.” They immerse viewers and listeners in an electronically induced environment in the fashion of “acoustic space,” which was typical of orality. Vision is not isolated from other senses; hearing is also involved, and it all resembles full-sensory involvement. The focus of attention goes not inside, into ideas and thoughts, but outside, to environmental signals – news and entertainments. This resemblance of electronic media consumption to the perception of the natural – spherical and simultaneous – environment led McLuhan to the idea of retribalization, but now at the level of the Global Village.
So, the “vocality” of delivering information (audio, voice, acoustic) is not essential for distinguishing McLuhan’s acoustic/visual space and, respectively, the cognitive conditions of orality and literacy. What is essential are the sensory-cognitive effects of a medium.
Now, digital media allowed not just “empathic involvement” in the induced environment but also empathic engagement. They brought oral-type interactivity even to writing. Text in email, and especially in messengers and on social media, is used in a conversational manner, as an interaction in a shared environment, similar to talking. And this is digital orality.
It is “orality” not because it’s “vocal” (it might be—but that is not essential) but because it is conversational, impulsive, and immersive. So, digital orality is not a “phonetic” phenomenon—it’s a cognitive and cultural condition. It’s a hybrid of literacy and orality—it inherits the effects of literacy (computers and the Internet are effects of literacy, specifically alphabetic/print literacy) and reintroduces—retrieves—the features of orality.
Paradoxically, the main “technical medium” of digital orality is still text; however, not exactly the text of books (the text of literacy), but texting—typed letters and other signs (ideograms, pictograms) that serve conversation and impulsive self-expression in the fashion of oral/tribal communication.
Digital orality completes McLuhan’s retribalization. Digital orality also completes the reversal of Ong’s “inward turn,” but in a peculiar fashion, like a Möbius strip—an “inward-outward turn,” as digital users are still physically isolated but immersed in a digitally shared environment.
I explain all this in detail in my “Digital Future in the Rearview Mirror: Jaspers’ Axial Age and Logan’s Alphabet Effect” (2024).
Sorry for the long digression […] The voice feature of any device is not the point; it’s just one of the carriers of a much greater phenomenon, digital orality.
Now to answer your first question: Yes, voice-driven technologies (e.g., Siri, Alexa, voice memos) further the shift away from print culture.
First of all, voice-driven technologies amplify the environmental immersion typical of orality: users interact with an environment induced as an outer world, perceived through senses, not through “inner vision”.
Second, voice interfaces allow conversational interaction, in which interlocutors rely on each other to develop a narrative or conversation. This is completely different from a literate narrative, where a writer or reader can rely only on the structure of thought and speech in developing a narrative.
This rewires the brain, of course: instead of “self-immersion,” cognitive delay for deliberation, and linear consideration of ideas and means of their expression, the user of voice devices engages in exchange, which is naturally impulsive, reactive, and requires emotional involvement rather than rational contemplation. Any voice-driven and interactional medium encourages the dominance of emotionality over rationality and reverses many other essential features of literacy.
However, I believe texting will hold a strong position in users’ habits of communicating with each other and smart devices or AI—at least until mind upload happens, when no mediation—text or speech—will be needed at all. But until then, texting will remain the dominant medium of digital orality.
The reason is simple: the physical isolation of digital users, especially digital natives. Due to the comfort and intimacy of personal devices, they are conditioned to maintain strict physical and social boundaries, hence the growing social anxiety of younger generations. They will not ask AI in public—they will text it. It’s more intimate and comfortable.
No less important: texted conversation is storable and shareable. It’s convenient to share or refer to. Finally, texting allows embedding visuals—emojis, GIFs, reels, memes, etc. This is a very important part of digital conversation and self-expression.
That’s why voice interfaces, while convenient in certain circumstances, will not replace texting.
Katherine: How might the resurgence of audio-driven content such as podcasts, audiobooks, and voice memos reshape journalism and storytelling practices?
Andrey: I think podcasts and audiobooks, added to listening to music while driving a car, have displaced much of talk radio and news radio for drivers. Radio, one of the last old media comparatively unaffected by the internet, survived precisely because drivers couldn’t use their hands or eyes while driving, thus protecting radio consumption from touchscreens.
It’s not a coincidence that the share of radio in the daily media diet and the time spent on radio have been approximately the same—one hour. As soon as self-driving cars free drivers’ hands and eyes, radio share will shrink and take its place somewhere near newspapers among endangered species—this is already happening.
However, some activities require hands and eyes but leave ears free for parallel media consumption. Radio will share this niche with podcasts and audiobooks. Anyone producing audio content should remember it is a secondary, background medium.
As for journalism, audio tools have furthered what the internet already started—the emancipation of authorship. Anyone can now participate, competing with professional journalism. In the last election, the most influential TV/audio medium was not Fox or CNN but the Joe Rogan podcast.
Audio-video has a major flaw—you cannot skim through audio in the same way as through text. Additionally, audio-video is not easily quotable without transcription, which is crucial for many. I don’t listen to podcasts; to me, they are a waste of time. If a podcast is relevant to my interests, it will be delivered to me in condensed form through resonance on social media.
So, the advantage of audio-video is its suitability as a secondary medium for parallel, background consumption, while its downside is the inability to skim content or easily quote it.
Katherine: What skills or literacies might be necessary for people to effectively navigate our changing media ecosystem?
Andrey: Literacy structured the world in the pattern of a catalog. Education was essentially the study of the catalog of knowledge to enable access to any other, more specialized knowledge.
The first websites were organized like books or libraries—with tables of contents or catalogs. The search box killed the catalog. There is no need to keep in mind the catalog pattern of your computer (directory tree), or your knowledge, or your world, when you can simply ask the search box on your computer, search engine, or generative AI.
With the search box, knowledge acquisition shifted from theorizing and reading, typical of literacy, to asking and talking, typical of orality. Consequently, the crucial skill in this mode of operation is prompt literacy: how to ask to get the best answer. Moreover, prompt literacy will soon become a matter of safety when we start prompting smart cars, smart homes, and anything smart with the capacity for physical action. With the wrong prompt, a smart device can hurt you socially or physically.
Prompt literacy, however, kills traditional print literacy. A search query turns the logically structured world into a pile of garbage where you need to grab what you need. The more proficient you are at picking exactly what you need, the less the rest of it needs structure. So, prompt proficiency kills the culture that was based on print literacy.
Another crucial media skill is learning not how to use a medium, but how not to use it. There should be anti-media literacy programs—how not to use media. Media evolution uses our hormonal stimuli for finding, sharing, socializing, thus fostering dopamine addiction to media use. This way media evolution makes us work for it. Just as bees are sex organs to plants, to use McLuhan’s metaphor, we are the sex organs of the media world. We help the species of media evolve. They reward us with convenience and hormonal satisfaction.
Understanding the hormonal nature of media consumption is crucial for media literacy, as it may help us switch off a device or switch between devices. Ultimately, media literacy is time management, and the time in question is the time of your life.
Katherine: Do you think there will be generational impacts? People like myself who are “native” to text-based digital culture vs. people who are native to a more video-centric digital culture vs. people who are from, say, a TV-based culture?
Andrey: Oh, yes. We, digital migrants, lived in times without personal digital devices, so we have experience with alternative communication. We still think digital use is a choice, an option. It is not the case for a person who has consumed touchscreens since toddler age.
Digital natives are conditioned by touchscreens and digital orality, as it’s the only mode of mediation of the world they know. Parents bribe babies with tablets to buy some child-free time; kids go to video games with conversational interfaces, then social media. This all fosters a completely different cognitive type in younger generations.
Predigital people generally know that significant effort brings significant and multilayered rewards. Reading Dostoevsky requires significant effort but brings not just intellectual epiphany but also social status and self-actualization. Building a romantic relationship requires long efforts but brings not just sex, but the comfort of marriage and the security of family. The sizable reward requires a sizable effort – this was the essence of the effort-reward system in the physical world.
Digital devices reward mere clicks, but the reward is also subtle. It never satisfies – it just keeps the user using the device. This radically rewires the effort-reward neurophysiological circuits. Digital media reward mere presence – just click to show yourself, your preferences – and therefore, mere presence, not effort, becomes something valuable. On digital platforms, “to do” is not as important as in the physical world; what matters is “to be” – to indicate your presence.
This cognitive setting leads to tectonic cultural consequences. The prevalence of “to be” over “to do” leads to the snowflake generation and identity politics, where identity trumps merit. It’s not important what you do; it’s important what you are – and so people see identity as credentials and demand rewards or penalties based on identities, not deeds.
Another outcome of the digital media shift is the fading ability of individuals to make long-term efforts. The brain is not conditioned to work hard and long when the effort worthy of reward is a mere click. As a result, education degrades, careers become harder to pursue, personal lives become difficult to build, etc. Overall, social anxiety grows.
Dealing with this issue starts with parenting. As a general rule, kids’ access to types of media should repeat the stages of humankind’s media evolution – physical toys and active games, listening to bards (parents), reading, electronic media, and only then, sometime around the age of 14, touchscreen devices. If the order is broken and digital devices come before toys and books, the brain won’t receive the neural exercise associated with previous media – eye-hand coordination, physical space orientation, concentration, diligence, long effort, and delayed reward.
However, the world has already switched from print media to digital devices, and we live inside the shift from print literacy to digital orality. No personal strategy can cancel or reverse this shift, so we need to get used to it.
Katherine: How might the blending of human speech and AI-generated speech reshape our perception? What impacts do you foresee it having?
Andrey: I can hardly suggest anything original in addition to deepfakes.
¹ This interview was adapted from another article that ultimately was killed. Unfortunately, this impacted the questions I asked and the length of this conversation.





Here you can listen to Marshall McLuhan himself -- now a "digital spirit presence" on YouTube -- speaking to us from the year 1967...
Part 1: https://youtu.be/c7TKg2GGkZ0
Part 2: https://youtu.be/wfDS5YsasYw
I was kinda nodding along most of the time thinking okay, yeah cool, tracking, but a couple things made me feel Andrey was less credible.
The weirdest, most unrealistic take is that you can’t give touchscreen devices to a kid until they are 14. If you only give your kids personal access to emergent / latest technology when they are 4 years from being an adult, they will lag behind their tech-savvy peers, which can affect educational and career opportunities in a competitive world, not to mention social conformity pressures.
I also thought he was being dismissive of what he refers to as “snowflake culture.” I feel that I don’t have to agree with “the snowflakes” and still understand their point that not everything is as simple and straightforward as the status quo may make you think. For example, the 13/50 meme. These days some use it for undocumented migrants / illegal aliens, but it is mostly attributed to “black people are only 13% of the population but commit 50% of the crime.” There are so many rebuttals that clarify the inaccuracy of this propagandizing meme, as well as solid statistics that show how some high rates can be explained outside of just being plainly racist against black people. So yeah, not everyone successful is because of real merit, and by and large, I’m not talking about snowflakes ❄️