The first day is different because we need to get the general categories. On this first day we will only have a list of keywords, and we will obtain the categories in the next step. From the second day onwards we will already have the categories; however, in the second step we will check again that no new categories have appeared (this will allow us to detect emerging categories).
Step 2. Clustering:
Unlike the previous case, we first extract the new categories with GPT 3.5 (let's say we get 250). We then group them with GPT 4 (they are reduced to around 100). Finally, as in the previous case, we send the keywords so that GPT 3.5 classifies them for us.
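As an illustration, here is a minimal Python sketch of this two-stage step using the OpenAI chat API; the prompts, function names, and exact model identifiers are assumptions for the example, not the project's actual ones.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def chat(prompt: str, model: str) -> str:
    """Single chat completion call; returns the raw text of the answer."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def extract_categories(keywords: list[str]) -> str:
    """GPT 3.5 proposes a first, noisy list of categories (~250)."""
    prompt = ("Propose high-level topic categories for these search keywords, "
              "one category per line:\n" + "\n".join(keywords))
    return chat(prompt, "gpt-3.5-turbo")

def group_categories(raw_categories: str) -> str:
    """GPT 4 merges near-duplicate categories (~250 -> ~100)."""
    prompt = ("Merge the following categories into a shorter list of distinct "
              "categories, one per line:\n" + raw_categories)
    return chat(prompt, "gpt-4")

def classify_keywords(keywords: list[str], categories: str,
                      model: str = "gpt-3.5-turbo") -> str:
    """Assign each keyword to exactly one of the final categories."""
    prompt = ("Assign each keyword to exactly one of these categories.\n"
              f"Categories:\n{categories}\n\nKeywords:\n" + "\n".join(keywords) +
              "\n\nReturn one 'keyword -> category' pair per line.")
    return chat(prompt, model)
```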
Step 3. The play-off:
In the last step, we will try to recover the poorly classified or unclassified keywords with another loop, mixing the keywords again. In this case, however, we will do it with GPT 4. In general the result is satisfactory and all the keywords end up classified.
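A minimal sketch of this "play-off" loop, reusing the classify_keywords() helper from the previous sketch and switching the model to GPT 4; the "keyword -> category" output format and the parse_assignments() parser are assumptions of the example.

```python
def parse_assignments(raw: str) -> dict[str, str]:
    """Parse 'keyword -> category' lines from the model output (assumed format)."""
    pairs = {}
    for line in raw.splitlines():
        if "->" in line:
            kw, cat = line.split("->", 1)
            pairs[kw.strip()] = cat.strip()
    return pairs

def playoff(keywords: list[str], categories: list[str], max_rounds: int = 3):
    """Re-send missing or invalid keywords until everything is classified."""
    assigned: dict[str, str] = {}
    pending = list(keywords)
    for _ in range(max_rounds):
        if not pending:
            break
        raw = classify_keywords(pending, "\n".join(categories), model="gpt-4")
        assigned.update(parse_assignments(raw))
        # keep only keywords still missing or mapped to an unknown category
        pending = [kw for kw in pending if assigned.get(kw) not in categories]
    return assigned, pending  # pending should be empty after the last round
```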
But how would this flow be integrated?
Let’s see an example through this Google Cloud functional diagram:
The operation would be as follows:
The project will operate using two databases:
GSC table
Auxiliary table in which the equivalences between the different keywords and their clusters will be saved (this table will grow day by day based on the new keywords that are not repeated).
For day X we will keep:
The keywords for which no equivalence is found in the auxiliary table (see the query sketch after the list below). This will allow us to:
Reduce volatility.
Reduce costs.
Reduce the time spent.
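As a rough illustration of that daily filter, assuming both tables live in BigQuery (the project, dataset, table, and column names below are placeholders):

```python
from google.cloud import bigquery

bq = bigquery.Client()

# Keywords from the GSC table for day X that have no saved equivalence
# in the auxiliary table (placeholder table/column names).
QUERY = """
SELECT gsc.query AS keyword
FROM `my_project.seo.gsc_daily` AS gsc
LEFT JOIN `my_project.seo.keyword_cluster_map` AS aux
  ON gsc.query = aux.keyword
WHERE gsc.date = @day
  AND aux.keyword IS NULL
"""

def new_keywords_for(day: str) -> list[str]:
    """Return only the keywords that still need to be sent to the models."""
    job = bq.query(
        QUERY,
        job_config=bigquery.QueryJobConfig(
            query_parameters=[bigquery.ScalarQueryParameter("day", "DATE", day)]
        ),
    )
    return [row.keyword for row in job]
```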
With GPT 3.5-turbo (priced at $0.001/1k tokens) we will obtain the categories (if these are not already present in our auxiliary table) into which the day's keywords should be classified.
With GPT 4 (priced at $0.03/1k tokens) we will refine the process and group these first categories (going from >230 to 50-70 categories).
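To put those prices in perspective, here is a back-of-the-envelope estimate; the token counts per keyword and per category list are rough assumptions.

```python
GPT35_PRICE_PER_1K = 0.001  # $ per 1k tokens (gpt-3.5-turbo)
GPT4_PRICE_PER_1K = 0.03    # $ per 1k tokens (gpt-4)

def daily_cost(new_keywords: int, tokens_per_keyword: int = 15,
               category_list_tokens: int = 2_000) -> float:
    """Rough daily spend: GPT 3.5 sees every new keyword,
    GPT 4 only sees the category list it has to condense."""
    gpt35_tokens = new_keywords * tokens_per_keyword
    return (gpt35_tokens / 1_000) * GPT35_PRICE_PER_1K \
         + (category_list_tokens / 1_000) * GPT4_PRICE_PER_1K

# Rough daily estimate: only a few cents for 5,000 new keywords at these prices.
print(f"${daily_cost(5_000):.2f}")
```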