It’s time for another conversation on The Search Session. I’m Gianluca Fiorelli, and joining me from Turkey is Metehan Yeşilyurt, who works at the intersection of AI, machine learning, and SEO. We talk about AI search, the limits of optimization, and what understanding LLM behavior really changes for SEO.
Key takeaways:
Classic SEO vs. AI Search: moving from deterministic to probabilistic systems, while SEO fundamentals continue to support AI optimization.
Google’s AI Overview vs. AI Mode vs. Web Guide: exploring how distinct they are and how they impact visibility, traffic, and SEO strategy in the evolving SERP landscape.
Tracking visibility in AI Search: with limited direct data, combining SEO tools, Vertex AI, regex filters, and Bing offers a better alternative to prompt-tracking tools.
Optimizing visual content for AI Search: how SEOs must go beyond alt text by combining image context, premium attributes, and user-friendly metadata.
Analyzing Perplexity’s approach: how follow-ups, citations, and embedding similarity help reduce hallucinations and offer a unique model among AI answer engines.
Foundations vs. Amplification in AI Search: for Metehan, effective optimization starts with technical audits, internal linking, and embedding analysis, aligned with real user intent.
Using log file analysis to understand AI bot behavior in AI Search: identifying which pages LLMs access versus those popular with humans reveals retrieval patterns.
Spam and manipulation in AI Search: recency bias, lost-in-the-middle issues, data poisoning, and prompt injection drive the need for stronger spam-fighting systems.
Plenty of takeaways to reflect on in this conversation. Let’s get started.
Transcript
Gianluca Fiorelli: Hi, I’m Gianluca Fiorelli, and welcome back to The Search Session. Today, we’re having as a guest one of those new faces who has really emerged in the SEO and AI search ecosystem over the last 12 to 18 months, with a really strong voice.
If I had to imagine our guest as a little boy, he’d probably be the classic kid who gets a toy as a gift but then breaks it apart because he wants to understand how it works.
He’s from Turkey, and I’m very happy to have another Turkish representative of the community here on The Search Session. He lives in Ankara, and his name is—let’s try to pronounce it correctly—Metehan Yeşilyurt. How are you doing?
Metehan Yeşilyurt: Yes! Thanks for having me today. It’s an honor to talk with you. I'm super excited about today, yes.
Gianluca Fiorelli: Okay! So, how are things going in Turkey, in Ankara? Is it getting colder, like it is here in Spain?
Metehan Yeşilyurt: Yes, it’s getting colder, but generally, it’s colder than Spain.
Gianluca Fiorelli: Yes, because you are in the center of Turkey.
Defining the Landscape: SEO vs. GEO vs. AEO
Gianluca Fiorelli: Alright, let’s start our conversation. I’ll begin with my classic question: How is SEO treating you lately?
Metehan Yeşilyurt: It’s actually both easy and hard to answer. I mean both sides, really. The new shift is definitely shaping my work. In my previous company, we were targeting the U.S. market, and then the Search Generative Experience stuff just came out, and then ChatGPT was released, and I saw a dramatic increase in traffic and clicks.
Besides that, visibility was getting better, but now, here we are. So, I don’t think AIO and GEO are very different from SEO, but let’s dive in.
Gianluca Fiorelli: Yes, let’s dive in. Or, as I like to say, let’s try to dissipate the confusion with more certainty and hopefully offer some insights and ideas to better understand this new landscape that’s emerging, and it’s already here.
You just said that you don't think that AIO and GEO are so different from SEO. I agree. Actually, I think there’s a basic misunderstanding when people say that SEO and—let’s call it SEO for AI search, as I like to define it—are two different things. That misunderstanding might come from a very old idea of SEO, which is also, let’s say, spread by the LLMs themselves.
Because if you’ve noticed, when LLMs talk about SEO, they always describe something that was fine 10 years ago: keywords and 10 blue links, when we know that the 10 blue links don’t really exist anymore, at least not in the classic sense.
So I think there’s this outdated idea of SEO that’s still hanging around, and no one has really managed to make it disappear, when, in reality, SEO is more of a framework about visibility and search.
Metehan Yeşilyurt: Yes.
Gianluca Fiorelli: "Search" in the sense of a more generic search, and then we have specific approaches and specializations in SEO, like technical SEO, but also think about the different surfaces. Local SEO differs significantly from general SEO, classic SEO, news SEO, and image and visual search. So LLMs are just another surface; it’s another addition to SEO.
Gianluca Fiorelli: But if we were to distinguish the vertical of SEO for AI search—let’s call it AI search in terms of generative search, defined as Generative Engine Optimization (GEO), and AI search for things like AI Overviews or Perplexity, which is somehow Answer Engine Optimization (AEO)—from the classic SEO that we all know, what would be the slight differences you usually see and define?
Metehan Yeşilyurt: Yes, I also tend to use AI Search or AI SEO more often. I also really like using the terms AEO and GEO in general. But I believe that if you build strong SEO fundamentals—for your website, for your brand—it’s also very effective for AI search.
Things are changing, especially in the retrieval part and the citation selection, as well as the machine-generated content side, particularly in the re-ranking phase. And this applies to AI Overviews, AI Mode, ChatGPT, Perplexity, and all LLMs, actually.
And there’s a setting in LLMs that we, as users, can’t change: the temperature setting. So every time we ask a question or start a conversation with an LLM, it’s basically trying to mimic the natural flow of human conversation.
If we meet tomorrow and start with the same question, the context won’t really change. Your question would stay basically the same; it’ll be very similar—but our wording and our semiotics can change. So, we are shifting our focus from deterministic search engines to probabilistic systems. And I believe this is the hardest part to explain at the moment.
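To make that deterministic-versus-probabilistic shift concrete, here is a toy sketch of temperature-scaled next-token sampling. It illustrates the general mechanism only, not any vendor's implementation, and the scores are invented.

```python
import numpy as np

def sample_next_token(logits, temperature=0.7, rng=None):
    """Toy illustration: sample one next-token index from raw model scores.

    Lower temperature sharpens the distribution (more repeatable wording);
    higher temperature flattens it (more varied wording for the same prompt).
    """
    rng = rng or np.random.default_rng()
    scaled = np.asarray(logits, dtype=float) / max(temperature, 1e-6)
    scaled -= scaled.max()                          # numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()   # softmax
    return int(rng.choice(len(probs), p=probs))

# Hypothetical scores for four candidate tokens.
logits = [2.0, 1.5, 0.3, -1.0]
print([sample_next_token(logits, temperature=1.2) for _ in range(10)])
```

Run the last line a few times and the sequence of picked tokens changes, which is exactly why the same prompt can produce different answers.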
Gianluca Fiorelli: Yes, and I read quite an extreme definition about LLMs today. I mean, extreme, but actually quite a good analogy. It said that LLMs are like auto-suggest on steroids. Because apart from the fact that they are always suggesting something at the end of the conversation to keep it going—and maybe we can talk more about this later—there’s also how LLMs work. Being probabilistic, they are essentially trying to guess, statistically, which words follow the others in the context of a conversation. So in that sense, it really is a kind of auto-suggest on steroids.
However, when it comes to AI search and LLMs, let's talk about Google. I’m sure you’ve studied Perplexity very deeply. But let’s say, even if ChatGPT is probably the real competitor of Google, Perplexity is somehow struggling to gain a market share.
Metehan Yeşilyurt: Yes.
Gianluca Fiorelli: But let’s talk specifically about Google and the variations in how it’s presenting AI to us, because we have more than one version. There’s the one we see every day, which is AI Overviews. Then we have AI Mode, which Google is now testing a lot. And we also have Web Guide.
All three are based on Gemini, so we could also place it as the third Google alternative for searching with AI. Have you had the opportunity to start seeing and studying what the real differences are between, let’s say, AI Overview, AI Mode, and, eventually, also Web Guide?
Metehan Yeşilyurt: Yes. In the great and old days, we were all chasing traffic and clicks, and it was easier compared to today, of course, with less competition. Let’s say, a decade ago, there were also some keywords or keyword variations and groups that were highly competitive. But if we focus on today, we can see millions of new pages being created with AI slop at the moment. And now, we also have indexing problems as SEOs.
I believe Google has the best ranking system ever; I think everyone can agree with me on this. But when it comes to AI Overviews and AI Mode, these are actually different products at the moment. They have different tabs.
What I can say is that we used to focus on informational queries; if you targeted long-tail keyword variations, you could drive more traffic in a short time. But with AI Overviews, recipe websites, for example, are losing traffic. Niche websites that heavily focus on informational content are affected.
Metehan Yeşilyurt: So, within AI Overviews, I can say yes, we’re seeing a very dramatic decrease in the number of clicks, and visibility matters most right now.
AI Mode has been particularly challenging for me; when I first started optimizing for AI Overviews and AI Mode, it was very difficult. But I’m also super excited, because, as I said, these are different products at the moment, with different tabs. And, as you mentioned, we also now have Web Guide, which I believe Google is testing.
Unfortunately, we are losing clicks, and we can see sudden shifts when we open our Google Analytics tomorrow or next week. There’s a crocodile-mouth chart in our Search Console accounts right now, with impressions climbing while clicks fall, so we can really see the great decoupling happening.
So it’s very challenging, and actually, I think it’s good for user experience. Sometimes, people are just looking for a simple answer. I believe that, given the latest developments in the AI search industry, Google is likely in a better position now than it was a decade ago, because query fan-out is also increasing search volumes right now.
So yes, it’s an interesting time, and these are technically different products. We are still trying to figure out how we can win citations or brand mentions at the moment. For the next year, I believe we can find better organic ways to increase our brand citations.
Gianluca Fiorelli: Yes, surely. And I think Google, maybe, as it’s testing things right now, also in AI Mode, is trying to find ways to make citation links, you know, the ones in the column, stand out more. Because there are tests showing that if you hover over one of those cited links in the sidebar, it presents a Web Guide–style layout of the content that’s being referenced. So that is very interesting, and maybe it’s going to increase the CTR, which is actually very, very, very small.
But in fact, people are obviously blaming Google, and especially AI Overviews, because AI Mode doesn’t get that much usage yet. But AI Overviews are there: we can’t opt out, and nobody can remove them from the SERP.
And also, we’ve seen it popping up in People Also Ask and so on. People are blaming Google not only for stealing content but also for stealing traffic. But actually, we could also say that it’s an amplification of a zero-click behavior that we were already living with. The only problem is that it's enhancing that zero-click pattern.
Measuring Visibility When Prompts Are Unstable
Gianluca Fiorelli: Therefore, if in AI Search the most important metric becomes visibility, then we also need to define what kind of visibility, because you can be visible and invisible at the same time. So we have another problem to deal with: finding the correct metrics to measure visibility.
There are tools for prompt tracking, prompt-tracking tools, but more and more people are starting to doubt the validity of these kinds of tools, because we know that a prompt, as you were saying at the beginning, can offer a different answer depending on the personal conversation history of each user. So, how do you deal with this problem of measuring visibility?
Metehan Yeşilyurt: Great question. First, I wish Google Search Console or Bing Webmaster Tools, let’s say, could bring more tools or insights for the visibility part. This is a waiting game for all of us. In the last 20 years of digital marketing, attribution has always come last.
So for visibility, I’m trying to use multiple tools at the moment, third-party SEO tools. You name it. I’m testing every SEO tool right now, trying to see which queries or questions are now triggering AI Overviews.
Google Cloud Console is another friend of mine; it has a Vertex AI configuration right now, and I highly recommend using it. You can set it up for your website; it’s a little bit technical, but I’m sure you can handle it. And for measuring visibility, I’ve started using regex in Search Console much more when filtering: I’m trying to see which question-style queries are actually visible in my Search Console data at the moment. And we know OpenAI has an agreement with Microsoft, so Bing is another player to use here, and of course, Google Analytics.
But to measure it in an efficient way, it’s obvious we need more insights, especially from official products like Search Console and Bing, or just others, including OpenAI.
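As one way to picture the regex-filtering workflow Metehan describes, here is a minimal sketch that applies a question-word pattern to queries exported from Search Console. The file name, column name, and pattern are assumptions to adapt to your own export; the pattern itself is the kind of expression you could also paste into the Search Console regex filter.

```python
import re
import pandas as pd

# Assumed: a CSV exported from the Search Console Performance report.
# Adjust the column name to match your export (e.g. "Top queries").
df = pd.read_csv("gsc_queries.csv")

# Question-style queries starting with a common question word.
question_pattern = r"^(how|what|why|when|where|which|who|can|does|is|are)\b"

questions = df[df["Query"].str.contains(question_pattern, flags=re.IGNORECASE, regex=True)]
print(questions.sort_values("Impressions", ascending=False).head(20))
```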
Gianluca Fiorelli: Yes, and I was recently at a conference, and we were making the hypothesis—which I think is quite realistic—that as soon as Google, and maybe also Bing, but especially ChatGPT and, I don’t know, Perplexity, find a way to monetize these platforms with advertising, they will have to offer at least some sort of dashboard for advertisers, because if not, how can you justify asking them to advertise on something that’s a black box?
So maybe we need to wait for ads to start popping up in LLMs and AI answers, because until that moment, probably neither Google nor OpenAI will have much interest in showing how these things are appearing.
And in terms of Google, I honestly think this will be something that our friends in paid search will have, but not SEO. So we’ll end up doing something that was very common—and still quite common—which is, let’s say, creating an account in Google Ads as SEOs and putting in our credit card just to see the data. Even if we don’t create any campaign, probably this is going to be the only way in the future.
Metehan Yeşilyurt: Yes. They’re still trying to find a way to monetize the whole AI search with these kinds of new cards.
Gianluca Fiorelli: Well, Google is testing on AI Mode. And I think it was quite obvious that they started with things like Shopping.
Metehan Yeşilyurt: Yes.
Gianluca Fiorelli: Because Shopping is, by default, something that can be easily embedded into an LLM answer. So now they’re also trying to do something similar by inserting local search features, like the Maps. But it's more difficult because structured data—let's call it that in general terms—is not a good fit. So it's harder for them to embed it inside.
Maybe that’s why we started to see, with Shopping, the appearance of advertisements. The problem is how to embed text ads. Because of the legislation, if they want to embed, let’s say, a promoted chunk into the answer, they would need to clearly state, “this chunk is sponsored by…” And that’s going to create an uncanny valley sensation when you’re reading the answer.
Because some parts of the answer are generated organically, and other parts would be sponsored. I don’t know; they have to find a way. It’s still something very much in progress, under construction.
Multimodal SEO: Optimizing for Images and Video
Gianluca Fiorelli: And one thing that nobody really talks about and I’ve always been passionate about—maybe because of my past in the audiovisual industry—is that everyone talks about AI and images, but mostly in the entertainment space of generative images, like Nano Banana, or, for video, Sora and Veo 3, and so on. But we know that LLMs are fundamentally multimodal.
Metehan Yeşilyurt: Yes.
Gianluca Fiorelli: They are starting to push the multimodal nature inside the answers. I was just doing a search on ChatGPT before our conversation—it was about which island to visit in Cape Verde—and ChatGPT was showing me images, like a gallery of photos for each island.
But nobody is really talking about how to optimize images and videos for LLMs and AI Search. And we know that this exists, just as it exists for text and for queries. And there’s the query fan-out that Google and Gemini also have, the visual search query fan-out, let’s call it that.
And you wrote and talked about this. Can you explain a little bit about this less-known aspect of AI Search?
Metehan Yeşilyurt: Yes. You are very experienced on this topic, by the way, and I love reading your blogs, thoughts, and ideas. I mean, for every post.
As for the image part, yes, the multimodal capabilities of LLMs are not even close to Google at the moment when it comes to crawling a page and understanding the whole context, including images. Therefore, Google has the best system in this area as well.
I built a Screaming Frog custom JavaScript snippet to identify some image gaps, some, let’s say, premium color gaps, on websites at the page level. And I realized that every week, new users are joining and using Google Lens, or they’re basically asking questions by attaching images. Let’s say you’re watching a TV series or a movie and you like a jacket an actress is wearing; you can take a picture and search for it online right now.
What I realized after doing some work and running a few research experiments is that Etsy is doing an amazing job in this area. Of course, it’s also a user-generated e-commerce platform, I can say. Some sellers are insistently using premium color names in their product images and in their titles. And I noticed something interesting: I actually expected to see more Amazon results in my experiments, but instead I saw many niche domains appearing in those visual search query fan-outs.
What I’m trying to say is, blue isn’t always just "classic blue." It can be some other variation of blue, and some users are searching using that exact variation or specific color name.
And as SEOs, I believe we are behind in the race at the moment. So, the visual part is very important, and we really need to use more user-friendly, and maybe also bot-friendly, alt text. In the last couple of years, we were only optimizing alt text and, let’s say, image file names on the server. But now it goes beyond that, and we need to be aware of it.
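As a rough stand-in for the kind of audit described above (the original was a Screaming Frog custom JavaScript, which isn't reproduced here), this sketch flags images whose alt text uses only generic color words rather than more specific ones. The color lists and the URL are hypothetical placeholders.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical examples of generic versus more specific "premium" color names;
# extend both lists for your own catalogue.
GENERIC_COLORS = {"blue", "red", "green", "black", "white"}
PREMIUM_COLORS = {"cobalt", "navy", "teal", "burgundy", "charcoal", "ivory"}

def audit_image_color_language(url: str) -> None:
    """Print product images whose alt text mentions only generic color words."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    for img in soup.find_all("img"):
        alt = (img.get("alt") or "").lower()
        words = set(alt.split())
        if words & GENERIC_COLORS and not words & PREMIUM_COLORS:
            print(f"Generic color only: {img.get('src')} -> '{alt}'")

audit_image_color_language("https://example.com/category/jackets")  # placeholder URL
```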
Gianluca Fiorelli: Yes, it’s not new, because it was already important for Lens.
Metehan Yeşilyurt: Yes.
Gianluca Fiorelli: So it was important for visual search. But I think that, in this case, we can use the same kind of analysis we do with embedding cosine similarity, but specifically for images, or even videos…
Metehan Yeşilyurt: Yes.
Gianluca Fiorelli: … in order to know where exactly to place the image inside the content text. Or, let’s think, for example, of fashion. A classic fashion product page description or the classic fashion website, which is usually 90% images and videos and 10% text. So, how to use all the information and the small amount of text that we can add in the body text, in the alt text, and in the file name In a way, that is correct but through embedding analysis?
So, using embedding cosine similarity analysis, in order to understand, “Okay, this image should also appear close to this other image because they match, because people tend to associate one image with another, and so on.”
And this could also be used, eventually, for related products, related categories, and product variants, etc., etc., etc. I think it’s a field that nobody is really talking about and it’s weird, because we know that a large part of the younger generation is using visual search massively.
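A minimal sketch of that kind of image embedding analysis, assuming the sentence-transformers library with a CLIP-style checkpoint that embeds images and text in the same space; the page copy and image file names are placeholders.

```python
from PIL import Image
from sentence_transformers import SentenceTransformer, util

# A commonly available CLIP checkpoint; any model that embeds both
# images and text into a shared space would work the same way.
model = SentenceTransformer("clip-ViT-B-32")

page_text = "Lightweight cobalt blue linen jacket for summer travel"
image_paths = ["jacket_front.jpg", "jacket_detail.jpg", "beach_scene.jpg"]  # placeholders

text_emb = model.encode(page_text, convert_to_tensor=True)
image_embs = model.encode([Image.open(p) for p in image_paths], convert_to_tensor=True)

# Cosine similarity between the page copy and each candidate image:
# higher-scoring images are stronger candidates to sit next to that text.
scores = util.cos_sim(text_emb, image_embs)[0]
for path, score in zip(image_paths, scores):
    print(f"{path}: {float(score):.3f}")
```

The same scores can be computed image-to-image to decide which visuals belong together on related products, categories, or variants.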
Metehan Yeşilyurt: Yes. May I ask you a question here? Can you also explain for our listeners or viewers what semiotics is?
Gianluca Fiorelli: Oh, yes. This is a concept that, I think, is incredibly important. Because the SERP itself, in my opinion, is moving from the classic, monolithic construct—like, let’s say, a proposition such as the 10 blue links—to something that is more narrative.
The medium is the answer. Before, the medium was just the medium; now the medium is the universe. So, semiotics means all the signs that we put into content, meaning content as text, image, audio, video, or whatever. Everything we represent has a meaning that resonates with the people we want to target. Very simply put, there are symbols, there are signifiers, and other kinds of signs.
For instance, symbols—when we see the four stars on an aggregator rating, that’s a symbol universally recognized as a quality score. And in text, let’s say, the definition "skip-the-queue ticket" is another signal. A textual one, but still a signal.
Or think of a search like, "how to get from the Ankara airport to Ankara." The SERPs themselves are full of semiotics. They present maps, they show the blue line indicating directions, and then you see the information, the icon of a car, the icon of a taxi, the icon of a bus, and so on.
So it’s important to start understanding this. And images are essential. For instance, let’s say we have an AI Overview, and we’re lucky enough to be at least one of the first three presented sources. We know the CTR is very low. But if we work on it, the only thing that can make us stand out is the thumbnail image.
Metehan Yeşilyurt: Yes.
Gianluca Fiorelli: If we use it very well, and choose very well, the thumbnail image for a search like the one I mentioned before, about which island to visit for a vacation in Cape Verde, if we choose an image that works very well in the very small frame given to a thumbnail, and that is, through our embedding analysis, very central to the topic people are searching…
Metehan Yeşilyurt: Yes.
Gianluca Fiorelli: … then probably, the eye goes immediately to the image before even reading the answer. And so, it will also stay in memory—the elements close to that image—like the title of the article, the name of the source, and so on. So yes, it’s extremely important for me—and it’s going to be even more important.
Then, we could also go into symbolic AI, which may be something that can return. It’s not really the type of AI that LLMs are using right now, but maybe it's something they will return to because it’s where the real grounding can be achieved, based on knowledge graphs, and these kinds of things. But you’re the guest in this conversation, not me. So let’s turn it back to you.
Metehan Yeşilyurt: Yes.
Perplexity vs. Google AI Overviews: Key Differences in AEO
Gianluca Fiorelli: Recently, you shared something that really resonated with me. It’s like, “Okay, for Perplexity, I needed to really dig and break it in order to understand it.” For Google, we have everything in the documentation.
Jamie Indigo does something similar, but with OpenAI. She always reports on the new documentation from OpenAI, which is very, very hidden inside their website. So now, since you’ve studied Perplexity so deeply, and let me give a bit of context here: if we think about Answer Engine Optimization, in that sense, Perplexity is the closest thing to AI Overview.
Metehan Yeşilyurt: Yes, yes.
Gianluca Fiorelli: So, you’ve studied Perplexity, and you’ve also studied AI Overviews, because we have to study them too, after reading all the documentation, which is public and very accessible from Google. What are the differences between the two that you would highlight?
Because obviously, when we create content, we should try to find some common minimal denominator so we don’t drive ourselves crazy thinking, “Okay, this works for AI Overview, this works for Perplexity, and this works for ChatGPT.” At the end of the day, it’s always the same content. We can’t create a monster just because we want to be visible everywhere.
Metehan Yeşilyurt: Yes, Perplexity, if you compare it with Google, Bing, and ChatGPT, has a very small market share at the moment. But what surprised me is that when I started working heavily with Perplexity a couple of months ago, I realized that, first, there are some technical layout differences, like showing follow-up questions.
And I can say that the first citation result in Perplexity also influences the follow-up questions at the end of each answer, so it encourages people to ask more questions. But yes, when Perplexity first entered the market, the web retrieval part was very challenging, because if you tried to use the web search function or grounding with LLMs, it actually created a lot of hallucinated responses, even if you included the web search results.
So, what Perplexity built, based on my findings, is a way to minimize, or let’s say drive down toward zero, the risk of hallucination. They’re using very high embedding similarity. And I believe Google is also using very high embedding similarity, but it depends on many factors: commercial queries, informational, navigational, etc.
But Perplexity is trying to collect results with more diverse perspectives from web searches, whether they’re using Google, as it seems, or Bing, or others. They’re trying to show more diversified results with high embedding similarity. And they’re running many fact-checking systems at the moment, I can say. So, they’re also finding success in the market.
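The following is only an illustrative sketch of the general pattern just described, keeping passages above a similarity threshold and then dropping near-duplicates so the citation set stays diverse. It is not Perplexity's actual pipeline; the embeddings are synthetic and the thresholds arbitrary.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def select_citations(query_vec, doc_vecs, min_sim=0.75, max_overlap=0.9, k=5):
    """Keep passages highly similar to the query, then skip near-duplicates
    so the final citation set stays diverse (a rough MMR-like heuristic)."""
    ranked = sorted(range(len(doc_vecs)),
                    key=lambda i: cosine(query_vec, doc_vecs[i]), reverse=True)
    chosen = []
    for i in ranked:
        if cosine(query_vec, doc_vecs[i]) < min_sim:
            break  # everything after this point is below the relevance bar
        if all(cosine(doc_vecs[i], doc_vecs[j]) < max_overlap for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    return chosen

# Synthetic embeddings standing in for a real embedding model's output.
rng = np.random.default_rng(0)
query = rng.normal(size=384)
docs = query + rng.normal(scale=0.5, size=(20, 384))  # passages correlated with the query
print(select_citations(query, docs))
```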
Embeddings, Topical Authority, and User Intent Mapping
Gianluca Fiorelli: Interesting. Somehow, there are two ways of thinking about optimizing for AI Search. One, and this is the one I recognize myself in more, even if I don’t deny the other one, is to go first with your foundation—so, technical perfection—so that all the content, of any kind, can be retrieved or used by LLMs, by Google, and so on. Then orchestrate the architecture of your content, and obviously go on to create great content—which is all good—and then you can also easily promote it and amplify it.
And then, there's the other, let's say, part of the community that says, "Go first with the amplification, because what really works is being mentioned by trusted sources, by sources trusted by AI." I think the two are complementary.
And when you are working with your clients, what is your workflow, considering these two different and complementary views?
Metehan Yeşilyurt: Yes. Great question, thanks for that. And this is a question I believe will come up even more next year for every SEO team: in-house, freelance, agencies, or any brand.
First, I start with technical SEO fundamentals. Actually, what is different in AI Search, if we compare it with chasing the traditional ten blue links, is that we were targeting clicks from deterministic results in Google SERPs, or Bing, or others. Of course, we’re mainly talking about Google at the moment. And those were results from a deterministic system, fed by user signals in Google’s backend and ranking systems.
Now, it's more complex. We're now chasing clicks, CTR, and visibility from machine-generated text. That’s one of the main differences in AI Search right now. So the retrieval part is very important. I start optimizing with a website audit. Of course, I’m using desktop SEO software and cloud-based tools. I’m basically testing every tool at the moment, as I mentioned.
I start with AI visibility audits, beginning with SEO fundamentals: cleaning broken links, improving the internal linking system, and improving performance. You can easily identify when a website is using hundreds of AI-generated pages. If they’re not using internal links, you can tell. And I believe Google will be able to do the same in the near future. I believe OpenAI will, too.
So, I start with the website technical audit; it’s very similar to SEO. Then I move on to mapping topical clusters of the website or brand. I try to understand how the brand is represented in the LLM’s embedding model. Let’s say Google has different embedding models, OpenAI also has different ones, and so does Claude. So I try to connect the dots between the embedding data, the website technical audit, improvements to internal linking, and creating new sections on the website.
Then, I try to understand—and I believe this is the most important part—the user intent. A couple of years ago, I was more focused on keyword research. But now, I’ve shifted my focus to user intent. I still use keyword research, of course, but now I try to collect real user questions from the web. You can use People Also Ask, and you can crawl Reddit, Quora, and some other platforms at the moment.
So, website audits, embedding data, and your representation in the embedding model and trying to match it with user intent. I have many spreadsheets right now.
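As a simplified picture of that workflow, embedding pages, clustering them into topical groups, and matching collected user questions against them, here is a short sketch assuming sentence-transformers and scikit-learn. The model name, page summaries, and questions are illustrative placeholders.

```python
from sentence_transformers import SentenceTransformer, util
from sklearn.cluster import KMeans

model = SentenceTransformer("all-MiniLM-L6-v2")  # a small general-purpose embedder

# Placeholder page summaries and user-style questions gathered from the web.
pages = [
    "Guide to choosing a lightweight travel jacket",
    "How to waterproof outdoor clothing at home",
    "Best islands to visit in Cape Verde",
    "Cape Verde travel budget and costs",
]
questions = [
    "which island in cape verde is best for families",
    "can I re-waterproof my rain jacket",
]

page_embs = model.encode(pages)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(page_embs)

# Map each question to its closest page, and therefore to a topical cluster.
question_embs = model.encode(questions)
similarities = util.cos_sim(question_embs, page_embs)
for question, row in zip(questions, similarities):
    best = int(row.argmax())
    print(f"'{question}' -> '{pages[best]}' (cluster {labels[best]})")
```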
Analyzing Logs to Study AI Search Bot Behavior
Gianluca Fiorelli: And talking about technical, one classic part of technical SEO is log analysis. And you talk about using log analysis in order to understand how AI bots are essentially visiting and hitting your website. And there are tons of different types of AI bots: the one for the models, the one for grounding, etc., etc. And multiply this for all the different types of LLMs now existing.
Have you seen some peculiar differences, for instance, between how the bots of OpenAI visit a website versus the bots of Gemini? Are there differences? Or do they more or less work the same?
Metehan Yeşilyurt: That’s a challenging question. I can easily say Google has the best layout parser system at the moment. And I feel like all LLM user agents are using layout parsing systems, and I believe they’re really similar at the moment and they’re not as successful as Google. And of course, you can identify user agents from analyzing log files.
But when it comes to log files, I’m asking this question to my clients or potential clients, “What are your top pages in LLMs, and what are your top pages for humans at the moment?” We can see which pages are very popular with humans by using Google Analytics.
But we can also identify the top pages for LLMs, for the retrieval/inference parts, from log files. Because we only see the traffic data in Google Analytics or other analytics platforms. And if your brand is mentioned in LLMs, let’s say in AI Mode, Gemini Deep Research, OpenAI, Perplexity, and others, it doesn’t guarantee you’ll drive traffic.
And we also know that everyone has different representations in their minds of any entity, let’s say Apple, or Samsung, or others; everyone has different thinking mechanisms. So, by analyzing log files, you can identify which pages LLMs and AI search engines are targeting for retrieval right now. I believe this is the most important part of my work and experiments so far.
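A minimal sketch of that kind of log analysis, assuming a standard combined access log at a hypothetical path. The user-agent substrings are an indicative, non-exhaustive list of AI-related crawlers, not an official registry, so adjust them to what you actually see in your logs.

```python
import re
from collections import Counter, defaultdict

# Indicative (not exhaustive) user-agent substrings for AI-related crawlers.
AI_BOTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User", "PerplexityBot",
           "ClaudeBot", "Google-Extended", "Bytespider"]

# Matches the request path and the trailing user-agent field of a combined log line.
line_re = re.compile(r'"(?:GET|POST) (?P<path>\S+) HTTP/[^"]*".*"(?P<ua>[^"]*)"$')

hits = defaultdict(Counter)  # bot name -> Counter of requested paths

with open("access.log", encoding="utf-8", errors="ignore") as log:  # assumed path
    for line in log:
        match = line_re.search(line)
        if not match:
            continue
        for bot in AI_BOTS:
            if bot in match.group("ua"):
                hits[bot][match.group("path")] += 1

for bot, counter in hits.items():
    print(bot, counter.most_common(10))  # top pages per AI crawler
```

Comparing those per-bot top pages with your top pages for humans in Google Analytics gives you the gap Metehan describes.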
Gianluca Fiorelli: So maybe introducing log analysis specifically for AI bots could be a more accurate way to understand visibility than generating synthetic prompts.
Metehan Yeşilyurt: Yes, yes, yes. You can try to generate synthetic prompts, or you can…
Gianluca Fiorelli: No, I was saying that maybe just analyzing the raw data in the log files, without generating synthetic prompts, could be a good way to understand the level of visibility you may already have, at least for your website, and to understand, “Okay, this section of the site is really interesting for LLMs, and these others are not. So how can we improve the visibility and the interest of the LLMs for those sections that are not regularly hit by LLM bots?” This could actually be a good way to do the kind of analysis we were talking about at the very beginning.
Spam, Hacks, and the Fragility of AI Search Systems
Gianluca Fiorelli: Let’s move to one last question about AI Search. AI Search is—as you've also said yourself—very hackable. And I don’t want to ask you for ways to actually hack LLMs, because I mean, we are honest marketers here. However, many of us would be lying if we said we’ve never tried at least one experimental hack to see how things are working.
But let’s talk about the other side. Spam is becoming very evident in LLMs.
Metehan Yeşilyurt: Yes.
Gianluca Fiorelli: And how do you think OpenAI, Perplexity, and, obviously, Google will find a way to start rolling out spam updates at the LLM level? Because when we talk about grounding, Google at least, but also OpenAI when it relies on Bing Search, depends on SERP results.
Metehan Yeşilyurt: Yes.
Gianluca Fiorelli: And formally, SERP results should be, in their majority, clean of spam. The problem is that maybe the query fan-out is so deep that the spam surfaces again. So, it's a matter of increasing even more the spam detection, especially for the very, very, very long tail in the query fan-out.
But for, let's say, the answers generated by training data—because maybe this training data is using websites that have AI slop inside, that have thin content, or have spam content inside, like the infamous listicles and so on—how do you think these players could fight spam at the LLM level?
Metehan Yeşilyurt: That's another great question, and I believe this will be one of the most popular questions for the next year, “How will OpenAI fight spam?”
And I can say, Google has the best system for fighting spam, of course. We know they have more than 20 years of experience with spam, and I’ve realized they have much better adversarial filtering; it’s in their API documentation, and you can read it.
So, I believe OpenAI and others need to build an in-house spam team, especially for the retrieval part, let's say. They don’t want to name it a "Search Team" like Google, but it is obvious they will need it.
Gianluca Fiorelli: They don't want to name it like that, but OpenAI is hiring SEOs.
Metehan Yeşilyurt: Yes, yes. They’re hiring SEOs. That’s funny. And you can even see some job listings on Meta’s sites as well. So, they need to build an in-house team to fight spam. And I believe they can start working with agencies or other companies, just like Google does for the human evaluation part.
And basically, the well-known biases of LLMs still apply, especially for the open-source models. And of course, they have their own moderation systems. This is valid for OpenAI’s ChatGPT, Perplexity, Claude, and others, like Mistral.
I can say one of the biases is recency or content freshness. There are also some academic research papers around it. You can just use a very recent date on your document and you can see you are more likely to show up in the re-ranking phase. So that’s one of them.
The second one I can say—it’s a general problem, not 100% related to spam—is the "lost-in-the-middle" context problem. LLMs handle content better at the beginning and ending parts when they try to categorize or classify it. So basically, you can use more FAQs at the end and more tables in the middle of your content, and you can see the difference.
The third one is data poisoning. It’s awful, actually. There’s no need to generate gigabytes of data, and there are also some academic research papers around this topic. I guess Britney mentioned it…
Gianluca Fiorelli: With Britney, you mean Britney Muller?
Metehan Yeşilyurt: Yes, yes.
Gianluca Fiorelli: Hi, Britney!
Metehan Yeşilyurt: Hi, Britney! 250 documents are enough to poison any LLM data at the moment. And I guess people will find better ways to use data poisoning in an efficient way in 2026.
And the fourth one is prompt injection. Basically, with every new model release, they’re more capable of detecting prompt injection attacks, but it still works. So you can use some alternative sentences in your web pages and, as you can see, influence the LLM’s citations.
Gianluca Fiorelli: So you’re telling me that you can manipulate the answers of LLMs by faking E-E-A-T signals, essentially?
Metehan Yeşilyurt: Yes, you can. I don’t want to encourage this.
Gianluca Fiorelli: No, me neither. But maybe this could also be the way for LLMs to start finding ways to fight spam and to anchor the E-E-A-T signal to the knowledge graph.
Metehan Yeşilyurt: Yes.
Gianluca Fiorelli: So to say, “Okay, you claim to be this, this, and this, but let’s check, on a public knowledge graph like Wikipedia or, better, Wikidata, whether you really are the brand you claim to be, or as important a brand as you present yourself.” And maybe not only introducing true brand signals into the game for detecting spam but also bringing back a very old concept, which is TrustRank.
Metehan Yeşilyurt: Yes, yes, yes. They have a system like this. Because Google is much better than other LLMs, especially for the ranking part and for personalization, since they’re collecting much more data. OpenAI is trying to collect more data, but I believe that if they somehow in the future fine-tune the retrieval part with user data, they’ll also realize that they can provide more accurate and less hallucinated results.
There’s also a chain-of-thought process happening right now, and you can influence the chain-of-thought deep reasoning process or just the reasoning process just by using a few tokens. Actually, we were talking about keywords before, and now we’re talking about tokens in the context of LLMs. Because every keyword has different token IDs.
Even different models under the same company—let’s say OpenAI—use different tokenizers, and those are different from Google’s. So you can influence the re-ranking process, because some systems, some web search pipelines, are using the LLM-as-a-judge method.
So if an LLM is used as a re-ranker, you can influence it. There are research papers on this, and it’s also public—you can read posts from Andrea Volpini. He mentioned a research paper called “Rank Anything First,” and if you use certain tokens to influence LLMs—yes, you can do it. So choosing a few tokens is not only about influencing citations but also the ranking process. I’m also testing it.
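To illustrate the point about tokenizers rather than the ranking manipulation itself, this small sketch shows the same phrase mapping to different token IDs under two OpenAI encodings, via the tiktoken library; other vendors' tokenizers segment text differently again.

```python
import tiktoken

phrase = "skip-the-queue ticket"

# Two different OpenAI tokenizer vocabularies; the IDs below are specific
# to each encoding, and Google or other vendors would produce others still.
for name in ("cl100k_base", "o200k_base"):
    enc = tiktoken.get_encoding(name)
    ids = enc.encode(phrase)
    print(name, ids, [enc.decode([i]) for i in ids])
```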
Gianluca Fiorelli: Well, we are all testing it. We are in the testing phase of this part.
Life Beyond AI Search
Gianluca Fiorelli: Okay, I think we talked too much, almost an hour, about AI.
Metehan Yeşilyurt: Yes.
Gianluca Fiorelli: So, let’s conclude the conversation by talking about you. Was I right, in my introduction, to imagine you as the classic little kid who receives a toy as a gift and breaks it apart in order to understand how it works?
Metehan Yeşilyurt: Yes. I mean, you're asking about my research in the beginning...
Gianluca Fiorelli: No, no, no. It’s more about you. When I was introducing you, I was imagining you as the classic kid who, let’s say, at a birthday, receives a toy and starts to play with it, but then—let’s say it’s a little car—breaks it because he wants to know how it works. That’s how I was picturing you.
Metehan Yeşilyurt: I respect many great names in the search space, including you. We are trying to move in ethical ways, but somehow you need to push boundaries to understand how systems work. So sometimes I’m walking close to the line, pushing boundaries at the same time. But it’s also very important how you tell people about these systems.
And as I mentioned, I love reading your posts, your ideas, and your thinking process. So I’m trying to also mimic those processes in my mind and understand systems better. And of course, at the end of the day, someone needs to pay the bills. So we are trying to make money using our skills. So yes, that was the process.
Gianluca Fiorelli: And what does Metehan like to do, apart from working and studying AI Search?
Metehan Yeşilyurt: I'm a proud dad. I have a daughter—she’s just around two years old. So, I'm working remotely at the moment in Turkey, in Ankara. Of course, I travel for some conferences or with my friends outside of the country.
I also love watching football at the moment. Maybe it’s a cliché, but I would say, “I love reading books.” And I'm obsessed with the stock market at the moment. So yes, taking care of my daughter, watching football, playing some video games and everything else comes second right now.
Gianluca Fiorelli: I mean, it’s a normal, quiet life, which is not bad at all. Not everybody is made for, I don’t know, diving or doing very extreme sports.
Metehan Yeşilyurt: Yes.
Gianluca Fiorelli: So thank you, Metehan. It was a real pleasure to have you here at The Search Session. Let’s think about, maybe one day in the future, doing a wider conversation with others you cited, like Britney and Andrea. They’re also my friends, so we could do a sort of panel about where AI Search is going—you know, that kind of fancy title. Let’s think about it. So, thank you.
Metehan Yeşilyurt: Yes. Thank you so much.
Gianluca Fiorelli: And you, my friends, thank you for also being my guests during this episode. Remember to ring the bell and subscribe to the channel, so you’ll be notified as soon as a new episode of The Search Session pops up. Thank you, and see you later.
Podcast Host
Gianluca Fiorelli
With almost 20 years of experience in web marketing, Gianluca Fiorelli is a Strategic and International SEO Consultant who helps businesses improve their visibility and performance in organic search. Gianluca has collaborated with clients from various industries and regions, such as Glassdoor, Idealista, Rastreator.com, Outsystems, Chess.com, SIXT Ride, Vegetables by Bayer, Visit California, Gamepix, James Edition and many others.
A very active member of the SEO community, Gianluca shares his insights and best practices on SEO, content, search marketing strategy, and the evolution of search daily on social media channels such as X, Bluesky, and LinkedIn, and through the blog on his website, IloveSEO.net.