[Solved] File downloads in headless Chrome/Chromium using WebdriverIO and the DevTools protocol

If you have found this blog on the internet, I assure you this problem of yours gets solved here and now once and for all.

Seven out of ten times while writing web automation tests, you will need a way to download a file, either by clicking a button or by going to a URL, in a headless browser automated by a framework of your choice. Usually, it’s Selenium + Scrapy if you write Python, depending on your use case, and there are hundreds of choices if you work in JavaScript. Let’s talk about the problems you are facing and how to quickly solve them.

My setup and the problem I had

At the time of writing, I had difficulty finding a documented way of downloading files with WebdriverIO that works over the DevTools protocol. Nothing I searched for and tried actually worked, so here I am writing about the solution that did. If your team provides cross-browser support for its application, read up on the newer automation protocols and consider migrating to the DevTools protocol. Automation through chromedriver and geckodriver is extremely browser-version specific and leads to dependency hell. Moving your web automation tests to the newer, more native DevTools protocol lets you automate the browser and gives you access to loads of features through the debugger.

WebdriverIO version - 6.3.6

// Extract from wdio.conf.js
runner: 'local',
automationProtocol: 'devtools',
services: ['devtools'],
capabilities: [{
    browserName: 'chrome',
    'goog:chromeOptions': {
        binary: '/usr/bin/chromium',
        args: ['--headless', '--disable-gpu', '--no-sandbox'],
        prefs: {
            'safebrowsing.enabled': false,
            'safebrowsing.disable_download_protection': true,
            download: {
                prompt_for_download: false,
                directory_upgrade: true,
                default_directory: '/home/vipulgupta2048/webd/'
            }
        }
    }
}],

More details on the working code and setup are over here in this GitHub repository.

Solution that worked

When you run any browser headlessly over the DevTools protocol, the browser instance doesn’t pick up the download settings you provide through capabilities or goog:chromeOptions in wdio.conf.js. If you take a look at my sample spec, what we essentially need to do is send a command through the browser debugger that changes the page’s download behavior and specifies the download path.

describe('Testing file downloads in headless browser', () => {
    it('should download the file', () => {
        // Open the URL
        browser.url('http://www.fileformatcommons.com/txt-file-format/')
        
        // Set the download behavior of the page right before the download
        browser.cdp('Page', 'setDownloadBehavior', {
            behavior: 'allow',
            downloadPath: '/home/vipulgupta2048/webd',
        });
        
        // Clicking the download button
        $('.wp-image-36').click()
        
        // Give enough time to let the file download
        browser.pause(5000)
    })
})

The setDownloadBehavior call needs to come right before you hit the button to download the file. The path you give must be absolute, and the directory must already exist. Something critical to note: this way, the browser won’t or can’t download the same file over and over again. So if you run your test repeatedly, be sure to fs.unlink (delete) the file on each run with the help of hooks or cleaner functions. You can read more about the debugger commands you can send with the devtools-service here.
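The cleanup step could look something like the following. Treat it as a minimal sketch rather than code from my repository: removeStaleDownload is a hypothetical helper name, and it only assumes Node's built-in fs and path modules.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not part of WebdriverIO): delete a previously
// downloaded file so the next run starts from a clean directory.
// Returns true if a file was removed, false if there was nothing to remove.
function removeStaleDownload(downloadDir, fileName) {
    const target = path.join(downloadDir, fileName);
    try {
        fs.unlinkSync(target);
        return true;
    } catch (err) {
        // A missing file is fine; anything else (permissions, etc.) is a real error.
        if (err.code === 'ENOENT') {
            return false;
        }
        throw err;
    }
}
```

In a Mocha spec you could call it from a beforeEach hook, for example removeStaleDownload('/home/vipulgupta2048/webd', 'sample.txt'), where sample.txt is a placeholder for whatever file your test downloads.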

Also, make sure to add devtools and @wdio/devtools-service to package.json and enable them in wdio.conf.js as applicable; without them, the browser.cdp method won’t work.

// Package.json

{
  "name": "webd",
  "version": "1.0.0",
  "description": "",
  "main": "wdio.conf.js",
  "directories": {
    "test": "test"
  },
  "dependencies": {},
  "devDependencies": {
    "@wdio/cli": "^6.4.0",
    "@wdio/devtools-service": "^6.4.0",
    "@wdio/local-runner": "^6.4.0",
    "@wdio/mocha-framework": "^6.4.0",
    "@wdio/spec-reporter": "^6.4.0",
    "@wdio/sync": "^6.4.0",
    "devtools": "^6.4.0"
  },
  "scripts": {
    "test": "mocha"
  },
  "author": "",
  "license": "ISC"
}

// wdio.conf.js

runner: 'local', 
automationProtocol: 'devtools',
services: ['devtools'],

That’s it. This should let you download files with your browser in headless mode and into the right path, which are the two major problems folks face while downloading files.
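One loose end in the spec is the fixed browser.pause(5000), which is either too long or too short depending on the file. A possible alternative is to poll the download directory instead. The sketch below is an assumption on my part, not something from the repository: isDownloadComplete is a hypothetical helper, and it relies on Chrome/Chromium marking in-progress downloads with a .crdownload suffix.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not part of WebdriverIO): returns true once
// `fileName` exists in `downloadDir` and no partial download remains.
// Assumption: Chrome/Chromium names in-progress downloads `*.crdownload`.
function isDownloadComplete(downloadDir, fileName) {
    if (!fs.existsSync(path.join(downloadDir, fileName))) {
        return false;
    }
    const partials = fs.readdirSync(downloadDir)
        .filter((entry) => entry.endsWith('.crdownload'));
    return partials.length === 0;
}
```

Because the check is synchronous, it should fit into the sync-mode spec as a condition for browser.waitUntil, roughly browser.waitUntil(() => isDownloadComplete(dir, name), { timeout: 10000 }), with dir and name standing in for your own download path and file name.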

Other solutions I had in mind

Let’s say my solution unfortunately didn’t work out for you. Don’t worry, I have loads more suggestions for you to try out.

browser.sendCommand() – Since I was using Chromium, I first went for the browser.sendCommand() method, which is documented under the Chromium protocol of WebdriverIO. It’s also listed in other answers on the web, but when I used it, an error was waiting for me in the before hook where I had placed the call. Do dig into this more: if browser.cdp() doesn’t work for you, maybe this can.
Virtual displays – So you want to use the browser headlessly, but it’s not working out for you in any way. A solution suggested on the Chimpy GitHub issue was actually quite innovative: what if we trick the browser into believing we have a display? This can be done through `xvfb`; check out this comment for more details.
Chrome profiles – My plan, if nothing else worked, was to simply edit the execution command of Chromium with the flag for a custom profile directory that has a preconfigured download path and all the right settings. This can be done through the --user-data-dir flag, as in the command below. It would be pretty awesome if it worked, but WebdriverIO doesn’t accept it.

$ /usr/bin/chromium --user-data-dir=/home/vipulgupta2048/.config/chromium/Profile\ 1
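If you want to experiment with the browser.sendCommand() route from the first suggestion, the call could be shaped roughly like this. It is only a sketch: allowDownloadsViaSendCommand is a hypothetical wrapper, and I am assuming the session runs through chromedriver (the WebDriver protocol) rather than automationProtocol: 'devtools', since that is where WebdriverIO exposes its Chromium-specific commands. It did not work in my DevTools setup.

```javascript
// Hypothetical wrapper around WebdriverIO's Chromium-specific sendCommand.
// `browser` is the WebdriverIO browser object; `downloadPath` must be an
// absolute path to a directory that already exists.
// Assumption: the session uses chromedriver, not the devtools protocol.
function allowDownloadsViaSendCommand(browser, downloadPath) {
    return browser.sendCommand('Page.setDownloadBehavior', {
        behavior: 'allow',
        downloadPath,
    });
}
```

The CDP method and parameters are the same ones passed to browser.cdp() in the spec; only the transport differs.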

Well, that was all from me. All this information should help you download files headlessly into the right paths. If not, I am happy to take a look. Just pop a comment down below with the specific problem you are facing, or ask the community for help over on Gitter. Those folks are very helpful; just be sure to be respectful of people’s time while asking questions. Till then, folks, live in the mix.


One last thought,

So, in theory, the DevTools protocol can’t run on the cloud, as I read somewhere. I may be wrong. But what if you could run your code on a server the size of the palm of your hand? That’s right: I started running all my tests locally on my Raspberry Pi. It’s extremely easy to deploy with balenaCloud, and you can get started in 5 minutes. See it for yourself here.


Hariprasad says:
13 June 2022 at 2:37 PM
Your detailed steps really helped resolve the file download issue. It worked on my local machine using ‘devtools’ as a service to download the file. But when running on BrowserStack I get a problem: browser.cdp is not a command.
1. Vipul Gupta says:
  13 June 2022 at 3:08 PM
  Thanks for posting your comment. Glad my post could be of some help to you.
  IIRC BrowserStack must be using NightwatchJS, which they recently acquired. So I would search for a debugging command to control the browser behavior at runtime when using NightwatchJS + devtools. It won’t be browser.cdp, but it could be similar.
