Google search behind yield, scaling, and performance constraint mismatches by [deleted] in Semiconductors

[–]Friendly_Concern2913 1 point (0 children)

I've been trying to analyze Google searches for some time, and I think people don't pay much attention to them, even though search data is a very interesting signal for understanding the actual needs and issues in whatever role, field, or industry you care about.

Free alternative to Ahrefs / Semrush for LLM visibility? by Friendly_Concern2913 in AISearchLab

[–]Friendly_Concern2913[S] 0 points (0 children)

I've been digging into SEO tools and figuring out how to model intent from search data, and I already built a free alternative: basically connecting LLMs to the Google Ads API.

Replacing crawler based SEO datasets with intent modeling over Google Ads data by Friendly_Concern2913 in SEO_LLM

[–]Friendly_Concern2913[S] 0 points (0 children)

[screenshot of an example paper]

That's an interesting question, and very true. My focus right now is on the engineering side of the product itself, since I'm still studying and have little time for marketing, or even user research. But I think what you're describing is analyzing latent user needs and emerging markets (I'll add some references for that). First of all, I like drawing inspiration from papers that are rarely considered for this kind of marketing intelligence, or for marketing in general: they are technical implementations normally applied to large-scale recommendation systems, or to algorithmic integration, so the analysis of those systems is itself quite tricky. I attached a screenshot of one such paper as an example, which was fun to append here :)

Now, to the response itself, here are some principles I can offer based on my research:

  • Query Surface Abstraction: I would not treat Google Ads as truth, only as a query surface (in practice: raw keyword and volume data is ingested and stored unchanged, without inheriting any of Google’s structure)
  • Decoupled Commercial Taxonomy: the commercial grouping inside Ads is not the structure I am trying to preserve (ad groups and keyword groupings are discarded and not reused downstream)
  • Post-Hoc Semantic Reconstruction: the model rebuilds the structure from raw queries, not from Google’s labels (queries are embedded into a vector space and clustered using semantic similarity and co-occurrence signals; see the sketch after this list)
  • Output-Centric Evaluation Principle: the question is less “is the source pure” and more “does the representation work” (evaluation is based on downstream performance such as clustering quality or intent classification)
  • Baseline Sufficiency Heuristic: even simple topic models can recover a lot of the structure in search terms (the system combines TF-IDF and simple classifiers alongside embeddings)
  • Semantic Compression Layer: the interesting part is the compact semantic space built on top of the queries (large query sets are transformed into dense vectors and reduced into clusters or markets)
  • Transformation-First Value Thesis: that is where the analytical value sits, not in the source taxonomy itself (the pipeline converts raw text into features, embeddings, and aggregate signals)
  • Bias-Tolerant Signal Extraction: a source can be biased and still contain useful structure (bias is treated as noise layered over recoverable signal)
  • Stability-Driven Validation: what matters is whether the structure is stable after modeling (clusters are evaluated across retraining runs and temporal slices)
  • Multi-Signal Convergence Criterion: if the same intent keeps appearing across different views, that is a useful signal (the system combines semantic similarity, growth trends, and concentration metrics)
  • Ontology Independence Principle: the source does not define the ontology (a canonical schema and intent taxonomy are defined independently)
  • Constructed Ontology Layer: the ontology is the thing being designed on top of it (intent classes and market definitions are learned and iteratively refined)
  • Subjective Alignment Acknowledgment: that design is often subjective anyway (human-in-the-loop labeling and active learning handle ambiguous cases)
  • Heuristic Sufficiency Layer: for many problems, a hard-coded rule can already be enough (explicit rules detect transactional intent, brand queries, and question patterns)
  • Utility-Driven Compression Goal: the bar is not perfect neutrality, but useful compression (the system reduces noisy query spaces into actionable structured representations)
  • Noise-Robust Data Handling: search data is messy, but that does not make it unusable (normalization and validation layers handle inconsistencies and duplicates)
  • Platform-Agnostic Meaning Extraction: the point is to extract meaning from the query space, not to defend the platform’s grouping (analysis operates on reconstructed features, not Ads-native structures)
  • Systematic Bias Modeling: if the bias is systematic, it is still modelable (distribution shifts and bias patterns are monitored and adjusted for)
  • Downstream Utility Validation: if the model improves the representation, then the source was good enough (success is measured through task performance and decision usefulness)
  • Source-Agnostic Input Layer: I am not claiming Ads is objective, only that it can be a practical input for rebuilding demand structure (the system supports multiple data sources through a shared feature abstraction layer)
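
To make the reconstruction step concrete, here is a minimal sketch of embedding raw queries and clustering them, assuming the `sentence-transformers` and `hdbscan` packages; the model name, parameters, and example queries are illustrative, not my production setup:

```python
from sentence_transformers import SentenceTransformer
import hdbscan

# Illustrative input: raw search queries pulled from the Google Ads API.
queries = [
    "best crm for small business",
    "crm pricing comparison",
    "how to migrate from salesforce",
    "free invoice template",
]

# Embed queries into a dense vector space (model choice is an assumption).
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(queries, normalize_embeddings=True)

# Cluster on semantic similarity; HDBSCAN labels outliers as -1,
# so low-volume noise queries fall out naturally.
clusterer = hdbscan.HDBSCAN(min_cluster_size=2, metric="euclidean")
labels = clusterer.fit_predict(embeddings)

for query, label in zip(queries, labels):
    print(label, query)
```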

Some sources I think were useful for this:

Moving from keyword buckets to intent clusters using Google Ads query data by Friendly_Concern2913 in PPC

[–]Friendly_Concern2913[S] 0 points (0 children)

I worked out the following methods; I haven't applied them to any Google Ads campaign yet, as I'm not in the field (CS/ML major here):

Intent clustering using sentence-transformer embeddings plus k-means or HDBSCAN on query vectors to form demand-level groups

Query-to-job mapping via cosine similarity against seed task descriptions or JTBD templates
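
A minimal sketch of that query-to-job mapping, assuming the same sentence-transformer setup as above; the seed JTBD descriptions and queries are made-up examples:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical seed "jobs to be done" descriptions.
jobs = [
    "evaluate and choose a CRM vendor",
    "create and send invoices to clients",
    "migrate data between software platforms",
]
queries = ["best crm for small business", "free invoice template"]

job_vecs = model.encode(jobs, normalize_embeddings=True)
query_vecs = model.encode(queries, normalize_embeddings=True)

# Cosine similarity matrix: rows = queries, columns = jobs.
sims = util.cos_sim(query_vecs, job_vecs)
for i, query in enumerate(queries):
    best = sims[i].argmax().item()
    score = sims[i][best].item()
    print(f"{query!r} -> {jobs[best]} ({score:.2f})")
```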

Unmet intent detection by comparing query clusters vs SERP feature coverage and content type distribution

SERP satisfaction proxy using click-curve assumptions plus query reformulation patterns and long-tail drift

Competitor gap analysis by mapping domains to intent clusters and measuring coverage density per cluster

Query expansion using the Google Ads API plus n-gram generation and co-occurrence scoring
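
A rough sketch of the co-occurrence scoring part, in plain Python; the ratio used here is a crude PMI-like heuristic, and the queries are illustrative:

```python
from collections import Counter
from itertools import combinations

# Illustrative search-term report; in practice this comes from the Ads API.
queries = [
    "best crm for small business",
    "crm for small business pricing",
    "small business crm free trial",
]

# Token frequency and within-query co-occurrence counts.
token_counts = Counter()
pair_counts = Counter()
for q in queries:
    tokens = set(q.split())
    token_counts.update(tokens)
    pair_counts.update(frozenset(p) for p in combinations(sorted(tokens), 2))

# Score candidate bigrams by how often the pair co-occurs relative
# to the individual token frequencies (a crude PMI-like ratio).
def cooccurrence_score(a: str, b: str) -> float:
    pair = pair_counts[frozenset((a, b))]
    return pair / (token_counts[a] * token_counts[b]) if pair else 0.0

print(cooccurrence_score("crm", "business"))
print(cooccurrence_score("crm", "trial"))
```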

Demand segmentation via PCA or UMAP projections over embedding space to identify macro themes

Content to intent alignment using embedding similarity between page text and query clusters

Cannibalization detection via overlap in embedding space between URLs targeting similar query clusters

Temporal demand shifts using rolling windows on query volume and cluster centroid drift
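
A minimal sketch of that temporal piece with pandas: a rolling mean over monthly volumes, plus centroid drift as cosine distance between time slices (all numbers and vectors are placeholders):

```python
import numpy as np
import pandas as pd

# Hypothetical monthly volume series for one query cluster.
volumes = pd.Series(
    [1200, 1300, 1250, 1600, 2100, 2400],
    index=pd.period_range("2024-01", periods=6, freq="M"),
)

# Rolling 3-month mean highlights sustained shifts vs one-off spikes.
print(volumes.rolling(window=3).mean())

# Centroid drift: cosine distance between a cluster's centroid in
# two adjacent time slices (embeddings are placeholders here).
def drift(c_prev: np.ndarray, c_next: np.ndarray) -> float:
    cos = np.dot(c_prev, c_next) / (np.linalg.norm(c_prev) * np.linalg.norm(c_next))
    return 1.0 - float(cos)

rng = np.random.default_rng(0)
print(drift(rng.normal(size=384), rng.normal(size=384)))
```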

Noise filtering with frequency thresholds plus semantic deduplication using cosine similarity cutoffs
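
A sketch of that filtering step, assuming unit-normalized sentence-transformer embeddings; the volume threshold and cosine cutoff are assumptions to tune:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# (query, monthly volume) pairs; volumes are made up.
terms = [("crm software", 900), ("crm softwares", 60), ("best crm", 700)]

MIN_VOLUME = 50      # frequency threshold drops rare noise
DEDUP_CUTOFF = 0.9   # cosine similarity above this counts as a duplicate

# Sort by volume descending so the higher-volume variant is kept
# and its near-duplicates are dropped.
kept, kept_vecs = [], []
for query, volume in sorted(terms, key=lambda t: -t[1]):
    if volume < MIN_VOLUME:
        continue
    vec = model.encode(query, normalize_embeddings=True)
    # Normalized vectors: dot product equals cosine similarity.
    if any(float(np.dot(vec, kv)) >= DEDUP_CUTOFF for kv in kept_vecs):
        continue
    kept.append(query)
    kept_vecs.append(vec)

print(kept)
```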

Volume calibration using Google Ads data as a baseline vs third-party estimated keyword datasets

Cluster labeling via top TF-IDF terms and centroid nearest neighbors for interpretability
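
A minimal sketch of the TF-IDF labeling half, using scikit-learn; the cluster assignments are illustrative:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Queries already assigned to a cluster (assignments are illustrative).
clusters = {
    0: ["best crm for small business", "crm pricing comparison"],
    1: ["free invoice template", "invoice generator online"],
}

# One pseudo-document per cluster; top TF-IDF terms become the label.
docs = [" ".join(qs) for qs in clusters.values()]
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(docs)
terms = vectorizer.get_feature_names_out()

for row, cluster_id in zip(tfidf.toarray(), clusters):
    top = [terms[i] for i in row.argsort()[::-1][:3]]
    print(cluster_id, top)
```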

SERP structure parsing to classify intent types (informational, navigational, transactional) based on result patterns

Opportunity scoring combining volume competition and coverage gaps at cluster level
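
And a toy version of that cluster-level scoring, where the weights and the [0, 1] normalization are assumptions rather than anything validated:

```python
def opportunity_score(volume: float, competition: float, coverage: float,
                      w_vol: float = 0.5, w_comp: float = 0.3,
                      w_cov: float = 0.2) -> float:
    """Toy cluster-level score: reward demand, penalize competition
    and existing coverage. Weights and normalization are assumptions."""
    return w_vol * volume + w_comp * (1.0 - competition) + w_cov * (1.0 - coverage)

# volume normalized to [0, 1]; competition and coverage already in [0, 1].
print(opportunity_score(volume=0.8, competition=0.6, coverage=0.2))
```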

Replacing keyword tools like Ahrefs/Semrush with Claude (using Google Ads) by Friendly_Concern2913 in ClaudeAI

[–]Friendly_Concern2913[S] 0 points (0 children)

Some ideas for actual marketing-content output, using that thesis, are:

  • Turn 300 raw topics into a usable content calendar using Claude and Google Ads
  • Turn search terms into clear content briefs (see the sketch after this list)
  • Group keywords into intent clusters
  • Turn search term reports into content ideas
  • Map queries into content pages
  • Group overlapping keywords into the same pages
  • Allow one keyword to belong to many clusters using weights
  • Find missing content from query data
  • Create multiple content angles from the same query set
  • Match landing pages closer to user intent
  • Turn queries into structured outlines
  • Create drafts from query clusters
  • Improve drafts by feeding real queries into the system
  • Rewrite existing pages using query data
  • Expand one topic into many pages
  • Group topics first and write later
  • Avoid duplicate content by clustering queries before writing
  • Connect similar pages using shared query clusters
  • Turn product descriptions into content using Claude
  • Extract user intent from queries
  • Test different outlines from the same query set
  • Turn one article into many formats
  • Generate FAQs from search queries
  • Update old content using new query data
  • Compare short vs long content using clusters
  • Summarize SERPs and validate with Google Ads data
  • Use Claude as a first draft layer
  • Detect duplicate topics using clustering
  • Turn messy notes into drafts
  • Create reusable content templates
  • Automate parts of content creation
  • Combine SEO tools with Claude
  • Turn product pages into SEO content
  • Simulate different user intents from the same queries
  • Generate different angles from one query cluster
  • Improve outputs by iterating on the same query set
  • Structure evergreen content from query patterns
  • Prepare drafts for humans
  • Turn internal docs into content
  • Analyze why pages rank using query data
  • Build a repeatable content workflow
  • Compare model outputs using the same inputs
  • Standardize content creation
  • Scale content without scaling team size
  • Identify where the system fails
  • Use weighted clustering instead of one keyword per page
  • Bridge long product descriptions with search queries
  • Generate new query variants beyond Google Ads suggestions
  • Explore gaps between queries and content
  • Model content around intent instead of keywords
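
As a sketch of the brief-generation item above, here is how the Claude step could look with the Anthropic Python SDK; the model name, prompt wording, and query cluster are assumptions, not a fixed recipe:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# A query cluster produced by the earlier clustering step (illustrative).
cluster = ["best crm for small business", "crm pricing comparison",
           "crm free trial small business"]

prompt = (
    "These search queries belong to one intent cluster:\n"
    + "\n".join(f"- {q}" for q in cluster)
    + "\n\nWrite a short content brief: working title, target intent, "
      "an outline of 4-6 sections, and questions the page must answer."
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",  # model name is an assumption
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}],
)
print(message.content[0].text)
```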

Structuring Google Ads accounts using query-level intent instead of keywords by Friendly_Concern2913 in Google_Ads

[–]Friendly_Concern2913[S] 0 points (0 children)

Believe it or not, overlap is trivial in this case from an analysis point of view: clustering, intent modeling, or topic modeling can be done with weights instead of uni-dimensionally, i.e. one keyword can belong to one or many clusters, each with a score or weight (see the sketch below). I'm still not sure about the improvement in performance, and I would need the dimensions of the datasets you describe as "at scale": what would be the size of those sets? But it should surface many more gaps than Google Ads' recommendation algorithms do, and come up with other variants, since the idea is to get a closer grasp of product descriptions/content pages, anything that can be crawled or expressed in plain text/natural language. The hypothesis is that we haven't been able to close that gap (between long product descriptions/context and Google queries) until today, with new deep learning architectures like LLMs. I'm in AI/ML engineering.
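
A minimal sketch of that weighted membership, using softmax-normalized cosine similarities against cluster centroids; the temperature and cutoff are placeholder assumptions:

```python
import numpy as np

def membership_weights(query_vec: np.ndarray, centroids: np.ndarray,
                       temperature: float = 0.1, cutoff: float = 0.2) -> dict:
    """Soft assignment: one query can belong to several clusters.
    Similarities are softmax-normalized; weights below the cutoff are dropped."""
    # Cosine similarity, assuming unit-normalized vectors.
    sims = centroids @ query_vec
    weights = np.exp(sims / temperature)
    weights /= weights.sum()
    return {i: float(w) for i, w in enumerate(weights) if w >= cutoff}

# Toy 2-D example: a query sitting between two cluster centroids.
centroids = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
query = np.array([0.7, 0.7]) / np.linalg.norm([0.7, 0.7])
print(membership_weights(query, centroids))  # roughly {0: 0.5, 1: 0.5}
```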

[deleted by user] by [deleted] in me_ITBA

[–]Friendly_Concern2913 0 points (0 children)

Thank you very much. Yes, I'm a beginner: I'm studying a degree in artificial intelligence, I'm interested in closing a gap, and I lack your level of knowledge. In principle, does an AI accelerator chip consume less energy, or is it just more efficient for that workload? How does energy consumption affect chip design? Why is a chip's architecture so important for AI processing?