• Moonrise2473@feddit.it
    3 months ago

    A search engine shouldn’t have to pay a website for the privilege of bringing it visits and ad views.

    Fuck reddit, get delisted, no problem.

    Weird that Google is ignoring their robots.txt, though.

    Even if Google pays them for the right to say that glue is perfect on pizza, having

    User-agent: *
    Disallow: /
    

    should block Googlebot too. That means Google programmed an exception into Googlebot to ignore robots.txt on that domain, and that shouldn’t be done. What’s the purpose of the file then?

    Because robots.txt is entirely honor-based (a crawler doesn’t even need to pretend to be another bot; it could simply ignore the file), it should instead be

    User-agent: Googlebot
    Disallow:
    User-agent: *
    Disallow: /
    
      • DaGeek247@fedia.io
        3 months ago

        My robots.txt has been respected by every bot that visited it in the past three months. I know this because I wrote a page that IP-bans anything that visits it, and I also listed it as a disallowed path in the robots.txt file.

        I’ve only gotten about 20 visits in the past three months, though, so it’s a very small sample size.
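        The trap described above can be sketched in a few lines with just the Python standard library. This is a rough illustration, not the commenter’s actual setup: the `/trap` path name and the in-memory ban set are assumptions (a real deployment would ban at the firewall and persist the list).

        ```python
        # Honeypot sketch: robots.txt disallows /trap, and any client that
        # requests /trap anyway gets its IP banned for all future requests.
        # The path name and in-memory ban set are illustrative assumptions.
        from http.server import BaseHTTPRequestHandler, HTTPServer

        TRAP_PATH = "/trap"   # also listed as "Disallow: /trap" in robots.txt
        BANNED = set()        # IPs that ever requested the trap path

        def check_request(ip, path):
            """Return an HTTP status; ban any IP that requests the trap path."""
            if ip in BANNED:
                return 403            # previously trapped: refuse everything
            if path == TRAP_PATH:
                BANNED.add(ip)        # robots.txt said no, so the bot bans itself
                return 403
            return 200

        class TrapHandler(BaseHTTPRequestHandler):
            def do_GET(self):
                status = check_request(self.client_address[0], self.path)
                if status != 200:
                    self.send_error(status)
                    return
                if self.path == "/robots.txt":
                    body = b"User-agent: *\nDisallow: /trap\n"
                else:
                    body = b"normal page\n"
                self.send_response(200)
                self.send_header("Content-Type", "text/plain")
                self.send_header("Content-Length", str(len(body)))
                self.end_headers()
                self.wfile.write(body)

        # To run it: HTTPServer(("", 8000), TrapHandler).serve_forever()
        ```

        Well-behaved crawlers read robots.txt and never request `/trap`, so only honor-system violators ever end up in the ban set.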

        • mozz@mbin.grits.dev
          3 months ago

          I know this because I wrote a page that IP-bans anything that visits it, and I also listed it as a disallowed path in the robots.txt file.

          This is fuckin GENIUS

          • Moonrise2473@feddit.it
            3 months ago

            Only if you don’t want any visits except your own, because this removes your site from every search engine.

            You should instead write a “Disallow: /juicy-content” rule and then ban anything that tries to access that path (only bad bots would follow it).