What is the best way to begin learning about browser automation?

Idoligna

Hello!

I have a background in programming - I hold a Bachelor's degree in Information Technology. However, the problem is that I learned it about 10 years ago and have forgotten most of it. We studied languages such as Java, Python, and PHP to some extent.

I am interested in creating my own bots for social media automation: sending private messages from multiple accounts at the same time, scraping content, posting, and upvoting. I have read that there are numerous things you need to fake about the browser, such as the canvas fingerprint, IP, and user agent, among others. I have no idea where to begin or how to proceed.

Do you have any suggestions or recommendations for me?
 
Custom bots:

  1. For building custom bots, you can use Selenium with Python. To get started, check out the book "Automate the Boring Stuff with Python", which offers a wealth of practical material. Learning Python is essential if you want to work effectively on bots.
  2. Alternatively, you can use Cheerio with Node.js (JavaScript). Some find this combination more intuitive, but it means learning JavaScript. With JavaScript, you can also control the browser directly when you need to (there's a small Cheerio sketch after this post).
  3. If you need more advanced capabilities, consider using C#. It provides greater control and allows you to create interfaces for your bots.
Already available software: If you prefer a quicker start, you can rely on ZennoPoster. There is a learning curve, but it's a decent option to begin with. You can find custom-made templates at affordable prices, and there are many free templates available for learning purposes. One significant advantage of ZennoPoster is its intuitive user interface; even though I'm usually hesitant about this kind of software, its design makes it genuinely easy to pick up. It's also highly customizable: you can include custom C# code snippets for precise control over your tasks.
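For the Cheerio route mentioned above, a minimal sketch might look like this (the target URL and the CSS selector are made up for illustration):

```javascript
// Minimal scraping sketch with Node.js + Cheerio (hypothetical URL and selector).
// npm install cheerio   -- fetch() is built into Node 18+.
const cheerio = require('cheerio');

async function scrapeTitles() {
  const res = await fetch('https://example.com/forum'); // placeholder URL
  const html = await res.text();
  const $ = cheerio.load(html);

  // Collect the text of every thread title (the selector is an assumption).
  const titles = [];
  $('.thread-title').each((_, el) => {
    titles.push($(el).text().trim());
  });
  return titles;
}

scrapeTitles().then((titles) => console.log(titles));
```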

 
I want to use ZennoPoster, but it seems outdated now that social networks use advanced browser fingerprinting. How can I learn more about browser fingerprinting and similar techniques so I can avoid getting banned?
 
I use Ubot Studio for basic tasks, but as you mentioned, relying solely on Ubot or ZennoPoster isn't enough nowadays.
 
Alternatively, you might consider WinAutomation, though I believe they were recently acquired by a major company; I'm not certain of the details.
 
One of the simplest and most effective ways to get started with automation is to become familiar with its principles: how automation actually interacts with a page, that is, working with different elements. When automating, you will need to input text, click buttons, and eventually perform more complex actions.

Once you dig deeper into the mechanics of web automation, you'll see what I mean. I recommend Puppeteer as your tool of choice. It's an npm module that is remarkably easy to use, backed by abundant documentation and plugins to support your automation work. You can find Puppeteer on npmjs.com.
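To give a feel for how little code a basic Puppeteer task takes, here is a minimal sketch that opens a page and saves a screenshot (the URL and output path are just examples):

```javascript
// Minimal Puppeteer sketch: open a page and save a screenshot.
// npm install puppeteer
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();

  await page.goto('https://www.google.com', { waitUntil: 'networkidle2' });
  await page.screenshot({ path: 'google.png' });

  await browser.close();
})();
```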
 
I don't have personal experience with it myself, but I have come across plenty of people who use Selenium.
 
When working with Node.js (JavaScript), my main choice is Puppeteer, and I combine it with the puppeteer-extra-plugin-stealth plugin (via puppeteer-extra) to look like a unique, real user and get past bot-detection mechanisms. To go further, I also use Alibaba's AnyProxy, which lets me connect through a residential proxy and intercept SSL requests whenever necessary.
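A rough sketch of that setup (the proxy address, credentials, test page, and launch flags are placeholders and assumptions, not the exact configuration described above):

```javascript
// Puppeteer with the stealth plugin behind an upstream proxy (placeholder values).
// npm install puppeteer puppeteer-extra puppeteer-extra-plugin-stealth
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');

puppeteer.use(StealthPlugin()); // patches common headless/automation giveaways

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    // Route traffic through a local or residential proxy (placeholder address).
    args: ['--proxy-server=http://127.0.0.1:8001'],
  });
  const page = await browser.newPage();

  // If the proxy requires credentials (placeholder values).
  await page.authenticate({ username: 'proxy_user', password: 'proxy_pass' });

  // A commonly used fingerprint test page to check what a site can see.
  await page.goto('https://bot.sannysoft.com', { waitUntil: 'networkidle2' });
  await page.screenshot({ path: 'stealth-check.png' });

  await browser.close();
})();
```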
 
The issue with Puppeteer, Nightmare.js, and similar libraries that drive a full browser is that if you get the implementation wrong, your bot will consume an enormous amount of resources. That is acceptable if scaling is not a top priority.

Alternatively, you can take a different approach that relies solely on HTTP requests to reproduce actions such as authentication and form submission. It is considerably more laborious to implement, but it scales exceptionally well.
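As a rough illustration of the request-only approach, here is a sketch of a form login done with plain HTTP calls, reusing the session cookie afterwards (the URL, field names, and single-cookie handling are assumptions; real sites usually add CSRF tokens and other checks you would have to replicate):

```javascript
// Sketch of a request-only login (no browser). URLs and field names are hypothetical.
// Uses the fetch() built into Node 18+.
async function loginAndFetchProfile() {
  // 1. Submit the login form the same way the browser would.
  const loginRes = await fetch('https://example.com/login', {
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: new URLSearchParams({ username: 'myuser', password: 'mypassword' }),
    redirect: 'manual', // keep the Set-Cookie response instead of following redirects
  });

  // 2. Grab the session cookie from the response (simplified to a single cookie).
  const sessionCookie = loginRes.headers.get('set-cookie').split(';')[0];

  // 3. Reuse that cookie on later requests to act as the logged-in user.
  const profileRes = await fetch('https://example.com/me', {
    headers: { Cookie: sessionCookie },
  });
  return profileRes.text();
}

loginAndFetchProfile().then((html) => console.log(html.length, 'bytes received'));
```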
 
We used to do this with curl, but as more websites move to Single Page Applications (SPAs) built with frameworks like React, Next, Vue, or Nuxt, the HTML often contains little more than a placeholder div that the application renders into. So you need a tool that can actually run the framework, and the simplest option is a real browser.

Indeed, Puppeteer (Chromium) can be fine-tuned; we managed to peak at 4,000 threads load-balanced across 5 dedicated servers, each with 32 GB RAM and an i7 CPU running Ubuntu. Interestingly, the i7s outperformed the Xeons, likely because Chromium uses some instruction sets that are better optimized on desktop CPUs.
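The post doesn't say exactly how the tuning was done, but a common way to cut per-page resource use is to launch Chromium with lean flags and abort requests you don't need (images, fonts, media). A sketch, with the flag list as a typical baseline rather than the configuration described above:

```javascript
// Sketch: reduce per-instance Chromium resource usage (flags are a common baseline,
// not the exact tuning used in the setup described above).
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: true,
    args: [
      '--no-sandbox',
      '--disable-dev-shm-usage', // avoid the small /dev/shm default on servers
      '--disable-gpu',
      '--disable-extensions',
    ],
  });
  const page = await browser.newPage();

  // Skip heavy resources that aren't needed for scraping.
  await page.setRequestInterception(true);
  page.on('request', (req) => {
    const type = req.resourceType();
    if (type === 'image' || type === 'font' || type === 'media') {
      req.abort();
    } else {
      req.continue();
    }
  });

  await page.goto('https://example.com', { waitUntil: 'domcontentloaded' });
  await browser.close();
})();
```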
 
Wow, I appreciate the link to the additional stealth plugin! There's a wealth of information on the wiki, which provides a great starting point. Although it feels overwhelming right now, I hope to gradually develop a demo program that automates taking a screenshot of Google. After that, I'll be ready to conquer social media sites!
 
Learn to simulate specific web activities such as logging in, posting on forums, and so on. Monitor the requests with a local proxy like Burp. If you're lucky, the whole thing can be automated with plain HTTP requests, provided the site isn't AJAX-heavy and doesn't use CAPTCHAs. For AJAX-heavy sites or more complex traffic simulation, Selenium WebDriver (for example with the Python bindings) is a viable solution.
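The post suggests the Python bindings; to stay consistent with the JavaScript examples above, here is the same idea with Selenium's Node bindings (selenium-webdriver). The URL and element names are made up for illustration:

```javascript
// Sketch of a Selenium WebDriver login (Node bindings; the Python bindings work the same way).
// npm install selenium-webdriver   (plus a ChromeDriver matching your Chrome version)
const { Builder, By, until } = require('selenium-webdriver');

(async () => {
  const driver = await new Builder().forBrowser('chrome').build();
  try {
    await driver.get('https://example.com/login');
    await driver.findElement(By.name('username')).sendKeys('myuser');
    await driver.findElement(By.name('password')).sendKeys('mypassword');
    await driver.findElement(By.css('button[type="submit"]')).click();

    // Wait for the post-login page (the URL fragment is an assumption).
    await driver.wait(until.urlContains('/dashboard'), 10000);
  } finally {
    await driver.quit();
  }
})();
```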
 