
    Web App Automation using custom trained YOLOv8 model and Playwright | by Shyamchandar | May, 2025

    By FinanceStarGate | May 11, 2025 | 4 min read


    In the world of automation testing and UI interaction, conventional methods rely on hardcoded selectors and DOM-based interactions. But what if we could interact with a webpage the same way a human does, by looking at it?

    In this project, I explore a novel approach: training a YOLOv8 object detection model to visually detect web elements, and using Microsoft Playwright to perform actions based on those detections. The result is a fusion of computer vision and browser automation that opens up exciting possibilities in test automation and accessibility.

    Traditional UI automation tools like Selenium and Playwright rely heavily on XPath, CSS selectors, or element IDs, which can be brittle when the UI changes. I wanted to find a way to visually identify and interact with elements, just as a human tester would.

    • Model: YOLOv8, a fast and powerful object detection model.
    • Dataset annotation: Created using Roboflow.
    • Automation: Browser control via Playwright.
    • Goal: Detect elements like buttons, input fields, or checkboxes visually and interact with them (click, type, etc.) without relying on selectors.

    Tech stack:

    • Python 3.x
    • YOLOv8 (Ultralytics)
    • Roboflow (for image annotation and dataset generation)
    • Playwright (for browser automation)
    • OpenCV (for image processing)

    For an AI model to perform accurately, it must be trained on a substantial, high-quality dataset. For this project, I used ChatGPT to generate a list of login page URLs and used the following Python code to open each one and capture a screenshot.

    import os
    import asyncio
    from playwright.async_api import async_playwright

    login_urls = [
    # Email Services
    "https://accounts.google.com/signin",
    "https://outlook.live.com/owa/",
    "https://login.yahoo.com/",
    "https://mail.protonmail.com/login",
    "https://accounts.zoho.com/signin",
    "https://login.aol.com/",
    "https://www.gmx.com/login/",
    "https://www.mail.com/int/",
    "https://www.icloud.com/mail",
    "https://www.fastmail.com/login/",

    # Social Media
    "https://www.facebook.com/login/",
    "https://www.instagram.com/accounts/login/",
    "https://twitter.com/login",
    "https://www.linkedin.com/login",
    "https://accounts.snapchat.com/accounts/login",
    "https://www.tiktok.com/login",
    "https://www.pinterest.com/login/",
    "https://www.reddit.com/login/",
    "https://www.tumblr.com/login",
    "https://www.quora.com/login",

    # Productivity
    "https://workspace.google.com/",
    "https://www.office.com/",
    "https://slack.com/signin",
    "https://zoom.us/signin",
    "https://trello.com/login",
    "https://www.notion.so/login",
    "https://app.asana.com/",
    "https://launchpad.37signals.com/",
    "https://auth.monday.com/auth/login",
    "https://app.clickup.com/login",

    # Finance
    "https://www.paypal.com/signin",
    "https://dashboard.stripe.com/login",
    "https://dashboard.razorpay.com/signin",
    "https://pay.google.com/gp/w/u/0/home/signup",
    "https://squareup.com/login",
    "https://account.venmo.com/login",
    "https://cash.app/login",
    "https://login.payoneer.com/",
    "https://wise.com/login",
    "https://www.xoom.com/signin",

    # Cloud Storage
    "https://drive.google.com/",
    "https://www.dropbox.com/login",
    "https://onedrive.live.com/",
    "https://account.box.com/login",
    "https://www.icloud.com/",
    "https://my.pcloud.com/#page=login",
    "https://mega.nz/login",
    "https://app.sync.com/",
    "https://web.tresorit.com/login",
    "https://www.mediafire.com/login/",

    # Developer Platforms
    "https://github.com/login",
    "https://gitlab.com/users/sign_in",
    "https://bitbucket.org/account/signin/",
    "https://stackoverflow.com/users/login",
    "https://id.heroku.com/login",
    "https://cloud.digitalocean.com/login",
    "https://signin.aws.amazon.com/signin",
    "https://portal.azure.com/",
    "https://console.cloud.google.com/",
    "https://app.netlify.com/login",

    # E-commerce
    "https://www.amazon.com/ap/signin",
    "https://www.flipkart.com/account/login",
    "https://signin.ebay.com/",
    "https://www.etsy.com/signin",
    "https://accounts.shopify.com/store-login",
    "https://www.walmart.com/account/login",
    "https://login.aliexpress.com/",
    "https://www.target.com/login",
    "https://www.bestbuy.com/identity/global/signin",
    "https://www.myntra.com/login",

    # Education
    "https://www.coursera.org/?authMode=login",
    "https://courses.edx.org/login",
    "https://www.udemy.com/join/login-popup/",
    "https://www.khanacademy.org/login",
    "https://www.duolingo.com/log-in",
    "https://auth.udacity.com/sign-in",
    "https://www.skillshare.com/login",
    "https://www.linkedin.com/learning-login/",
    "https://www.futurelearn.com/sign-in",
    "https://app.pluralsight.com/id/",

    # Healthcare
    "https://mychart.com/login",
    "https://member.webmd.com/login",
    "https://www.healthline.com/login",
    "https://www.zocdoc.com/login",
    "https://www.practo.com/login",
    "https://www.1mg.com/login",
    "https://www.apollo247.com/login",
    "https://www.netmeds.com/customer/account/login",
    "https://pharmeasy.in/login",
    "https://account.docusign.com/",

    # Gaming
    "https://store.steampowered.com/login/",
    "https://www.epicgames.com/id/login",
    "https://www.origin.com/login",
    "https://us.battle.net/login/en/",
    "https://login.live.com/",
    "https://www.playstation.com/en-in/sign-in/",
    "https://accounts.nintendo.com/login",
    "https://www.gog.com/account/login",
    "https://www.twitch.tv/login",
    "https://discord.com/login",
    ]

    output_folder = "screenshots"
    os.makedirs(output_folder, exist_ok=True)


    async def capture_screenshots():
        async with async_playwright() as p:
            browser = await p.chromium.launch(headless=True)
            # Fixed viewport so every screenshot has the same dimensions
            context = await browser.new_context(viewport={"width": 1280, "height": 800})

            for index, url in enumerate(login_urls):
                page = await context.new_page()
                try:
                    await page.goto(url, timeout=60000)
                    # Build a file name from the host, e.g. "001_accounts_google_com.png"
                    domain = url.split("//")[1].split("/")[0].replace(".", "_")
                    filename = f"{index + 1:03d}_{domain}.png"
                    path = os.path.join(output_folder, filename)
                    await page.screenshot(path=path)
                    print(f"Captured: {path}")
                except Exception as e:
                    print(f"Failed to capture {url}: {e}")
                finally:
                    # Close the page even when navigation fails
                    await page.close()

            await browser.close()


    asyncio.run(capture_screenshots())
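
The file-naming step in the script above can also be written with the standard library's URL parser, which avoids manual string splitting. This is a minimal sketch, not part of the original script; the helper name `screenshot_filename` is hypothetical.

```python
from urllib.parse import urlparse


def screenshot_filename(index: int, url: str) -> str:
    """Build a zero-padded, filesystem-friendly PNG name from a URL.

    Hypothetical helper for illustration; not part of the original script.
    """
    # netloc is the host part of the URL, e.g. "accounts.google.com"
    host = urlparse(url).netloc
    # Replace characters that are awkward in file names (dots, port colons)
    safe_host = host.replace(".", "_").replace(":", "_")
    return f"{index:03d}_{safe_host}.png"


print(screenshot_filename(1, "https://accounts.google.com/signin"))
# prints 001_accounts_google_com.png
```

The zero-padded index keeps the screenshots sorted in the same order as the URL list, which makes it easier to trace an image back to its source page during annotation.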

    After collecting the images from various web applications, we upload them to Roboflow for annotation. Using the annotation tool, we draw rectangular boxes around the buttons visible in the web application images. This process must be repeated for all images containing buttons.

    Examples of annotated images

    After annotating all the images, we export them in YOLOv8 format, including both training and validation datasets along with their corresponding images and labels.
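
For reference, each label file in a YOLO-format export holds one line per annotated box: a class index followed by the box center, width, and height, all normalized to the image size. Roboflow writes these files for you; the sketch below only illustrates the conversion from pixel coordinates. The function name `to_yolo_label` and the example coordinates are made up for illustration.

```python
def to_yolo_label(class_id, x1, y1, x2, y2, img_w, img_h):
    """Convert a pixel box (top-left, bottom-right corners) to a YOLO label line."""
    # Box center, normalized to the 0..1 range
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    # Box size, normalized to the 0..1 range
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x80 pixel button centered in a 1280x800 screenshot, class 0 ("buttons")
print(to_yolo_label(0, 540, 360, 740, 440, 1280, 800))
# prints 0 0.500000 0.500000 0.156250 0.100000
```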

    In this project, we focus on training the model specifically to identify buttons on web pages. The same approach can be used to train the model to recognize other web elements such as links, dropdowns, checkboxes, and more.

    from ultralytics import YOLO

    # Start from the pretrained YOLOv8 nano weights
    model = YOLO("yolov8n.pt")
    model.train(
        data="data.yaml",  # path to your data config
        epochs=50,
        imgsz=640,
        batch=16,
        device="cpu",  # set to 0 to use the first CUDA GPU, if available
    )

    Sample data.yaml file

    path: dataset
    train: train
    val: val

    nc: 1
    names: ['buttons']
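
Before starting a training run, it can help to confirm that every screenshot in the export has a matching label file. The sketch below assumes the exported dataset keeps images and labels in sibling `images` and `labels` folders (for example `dataset/train/images` and `dataset/train/labels`); that layout and the helper name `images_missing_labels` are assumptions, so adjust the paths to match your export.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def images_missing_labels(images_dir, labels_dir):
    """Return the sorted names of images that have no matching .txt label file."""
    images_dir, labels_dir = Path(images_dir), Path(labels_dir)
    missing = []
    for image in sorted(images_dir.iterdir()):
        # Skip anything that is not an image file
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # A label file shares the image's base name, with a .txt extension
        if not (labels_dir / f"{image.stem}.txt").exists():
            missing.append(image.name)
    return missing
```

Calling `images_missing_labels("dataset/train/images", "dataset/train/labels")` would list the screenshots that were never annotated, which is a quick way to spot pages that were skipped in Roboflow.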

    When the script runs, the model is trained and the results are saved in the runs directory.

    Confusion Matrix

    F1 Curve

    Overall Results

    import time

    from playwright.sync_api import sync_playwright
    from ultralytics import YOLO

    # Weights produced by the training run
    model = YOLO("best.pt")

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=False)
        page = browser.new_page()
        page.goto("https://app.sprybe.ai")
        # Capture only the visible viewport: page.mouse.click() takes
        # viewport coordinates, so a full-page screenshot could misplace
        # clicks on elements below the fold
        page.screenshot(path="page.png")

        results = model("page.png")
        boxes = results[0].boxes.xyxy
        if len(boxes) > 0:
            # Click the center of the first detected button
            x1, y1, x2, y2 = boxes[0]
            x_center = float((x1 + x2) / 2)
            y_center = float((y1 + y2) / 2)
            page.mouse.click(x_center, y_center)
        else:
            print("No button detected")
        time.sleep(10)  # keep the browser open to observe the result
        browser.close()

    boxes.xyxy gives the bounding boxes of detected objects in the format [x1, y1, x2, y2], where:

    • (x1, y1) is the top-left corner of the box.
    • (x2, y2) is the bottom-right corner.

    In this example, the trained model is 95% confident that the detected element is a button.

    x_center = float((x1 + x2) / 2)
    y_center = float((y1 + y2) / 2)

    The lines above calculate the center coordinates of the bounding box, i.e., the middle of the detected element.

    page.mouse.click(x_center, y_center)

    The line above uses Playwright (a web automation library) to simulate a mouse click at the center of the detected object on the current page.
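
When a page contains several detections, one option is to click only the most confident one and to skip the click entirely when nothing clears a threshold. The sketch below works on plain Python lists; with Ultralytics these could come from `results[0].boxes.xyxy.tolist()` and `results[0].boxes.conf.tolist()`. The helper name `pick_click_target` and the 0.5 default threshold are assumptions for illustration, not part of the original script.

```python
def pick_click_target(boxes, confidences, min_confidence=0.5):
    """Return the (x, y) center of the most confident box, or None.

    boxes: list of [x1, y1, x2, y2] in pixels
    confidences: list of floats, one per box
    """
    best = None
    for box, conf in zip(boxes, confidences):
        # Ignore weak detections
        if conf < min_confidence:
            continue
        if best is None or conf > best[1]:
            best = (box, conf)
    if best is None:
        return None
    x1, y1, x2, y2 = best[0]
    # Center of the bounding box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


boxes = [[100, 200, 300, 260], [500, 600, 620, 640]]
confidences = [0.95, 0.40]
print(pick_click_target(boxes, confidences))
# prints (200.0, 230.0)
```

Returning None instead of raising lets the calling script decide what to do when no button is found, for example retrying after the page finishes loading.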

    • Selector-free UI testing
    • AI-powered browser bots

    I tested the model on unseen web pages. Detection accuracy was consistently high, and clicks via Playwright landed on the intended UI targets even on dynamic layouts.
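
One common way to put a number on detection accuracy is intersection over union (IoU) between a predicted box and a hand-labeled box. The article does not say which metric was used, so the sketch below is only an illustration of the general technique; the function name `iou` and the example boxes are made up.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlapping region (zero if the boxes are disjoint)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


# Two 100x100 boxes that overlap by half their width
print(round(iou([0, 0, 100, 100], [50, 0, 150, 100]), 3))
# prints 0.333
```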

    Execution link: https://youtu.be/yJOHjlVUCmE

    • Add OCR to read element labels
    • Package as a no-code tool for testers
    • Train with multilingual UI datasets
    • Use NLP to convert manual test cases into scriptless automation

    This project bridges a custom-built YOLO model and Playwright, offering a human-like way to test and interact with web pages, with no DOM selectors required.




