Ability to edit the web page 'PAGE CONTENT' box

Hi Elfsight, absolutely loving the AI Chat Widget so far!

Our web host unfortunately blocks most page crawls from the crawler Elfsight uses for the AI Chat Widget. We could likely get around this ourselves by gathering the data via a separate method, if we were able to manually edit the text content of the affected web pages.

If/until this becomes a feature, our next best solution is to upload summaries of those pages as text blocks or via a .txt file, but this feature would improve the workflow for people in a similar situation and keep all the webpage-related content in a single place.

Thank you for taking the time to read :grin:

Hello there and welcome to the Community, @S4LSam :waving_hand:

I see that you’ve previously discussed this issue with my colleague Mike, and he recommended setting up an exception for requests that include the X-Robots-Txt: JinaReader header. If the issue still persists after that, clicking the Retrain button should resolve it.
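For reference, here is a minimal sketch of what such an exception could look like on a server you control directly. This assumes Apache 2.4 with mod_setenvif and mod_authz_core enabled; the header name and value come from Mike's suggestion, and the IP range is only a placeholder for whatever allow-list rules you already have:

```apacheconf
# Hypothetical sketch only -- assumes the crawler really sends
# "X-Robots-Txt: JinaReader" on its requests.
# Flag any request carrying that header.
SetEnvIf X-Robots-Txt "JinaReader" allow_jina_reader

<RequireAny>
    # Existing allow-list rules would go here, e.g. your IP ranges.
    Require ip 203.0.113.0/24
    # Let flagged crawler requests through based on the header.
    Require env allow_jina_reader
</RequireAny>
```

Of course, if the blocking happens upstream of your server (e.g. at the host or CDN level), a rule like this on your own server won't help, and the exception would need to be configured with the host instead.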

Have you already tried taking these steps to troubleshoot the issue?

Hi Max,

Thank you for the fast response.

Our web host (SiteGround) only allows a maximum of 5 different /16 ranges for allow-listing on VPS hosting, and does not allow us to set up exceptions any other way (such as your header idea; I was excited to try that out too! :sweat_smile: ).

In my testing when we were first affected, I identified the most-used /16 ranges and requested that they be added as exceptions; however, within days this became ineffective, and keeping on top of it would require a lot of trial-and-error testing on my end.

As a result, I believe I may be speaking on behalf of others when I mention this!

And lastly, to answer your question: we have found retraining unfortunately only works a small fraction of the time, with little rhyme or reason as to when it will work. Again, because we are on a more managed hosting plan, it is not something our host willingly expands upon when I ask to know more!

Thank you,

Sam

Thank you so much for the clarification!

Just to confirm I got all the details right: when you add a page, you get error text in the Page Content section, and you’d like to manually replace it with the relevant content. Am I right?

Hi Max,

That’s right. In our case, CloudFront blocks the crawler before it reaches anything on our end that we might have more control over.

Here’s an example of an affected page; most sites we scan return 50-80% of pages like this. Sometimes spamming the Retrain button works, but that doesn’t feel right to do, and in the case of this particular widget, I have been revisiting it to retrain every couple of days for weeks and it won’t change! All of this, of course, comes down to how our websites are protected, nothing to do with you, but it lets me emphasize that it would be a lot quicker for me to summarize the affected pages using other methods and replace the title and content manually, should I choose to.

Thank you

Sam

Got it, thanks!

We’ll keep this enhancement in mind for the future, especially if more users support the idea :slightly_smiling_face: