How to Scrape Data from Indeed.com with PHP


Indeed is one of the largest job posting sites in the world, with millions of unique visitors every month. This giant job portal features listings posted by recruiters from all around the globe. When creating a job portal, it’s quite easy to populate your site with initial listings by scraping jobs from Indeed.com. Today, I’m going to show you how to scrape data from Indeed.com using PHP.

Recently, I developed a XenForo Job Listings addon, essentially a mini job portal where users can post jobs by category and region. To populate the database, I scraped job listings from Indeed.com using PHP. I’m now sharing the complete script that did the job.

Scrape data from indeed.com with php code

Here is the complete PHP script I coded to scrape data from Indeed.com. It has been tested and is being used on several forums without any issues. You’re free to implement and use it for your own needs.

Web scraping with PHP is quite easy, but it’s important to understand the structure of the webpage and identify the key data you need to extract. I recommend reading the code explanation provided below to better understand how it works.

Scrape data from indeed.com with php code explanation

I know I said earlier that scraping data from Indeed.com is very easy, but after looking at the code, you might want to kill me! 😅 But seriously, believe me, it is easy once you understand it. So take my hand, and let’s walk through the code together.

Overview of the php code:

We’ve created a function called fetchJobs inside the IndeedFetch class. This function takes two arguments: the brands (job titles) and the locations. In my implementation, each brand and location also carries an ID, which you can remove if you’re not planning to store that data in a database. Once we have the location and job title, we need to loop through all the pages that contain job listings related to the given location and brand.
For each page, the script extracts the individual job titles and descriptions, which can then be saved into your own database. The script does this automatically for every combination of the provided locations and job titles.

Step by step explanation:

    public static function fetchJobs($brands, $locations)
    {
        // Loop through each brand
        foreach ($brands as $brand) {
            // Prepare the brand parameters
            $brandTitle = $brand['brand_title'];
            $brandId = $brand['brand_id'];
            // Brand title format used for the first page at indeed.com
            $brandTitle1 = str_replace(' ', '-', $brandTitle);
            // Brand title format used for all other pages
            $brandTitle = str_replace(' ', '+', $brandTitle);

Here we receive the brands and locations. Please note that I am passing the brands and locations as arrays that include the brand and location IDs. You can remove the IDs if you do not want to save data back to the database. The brand titles have to be prepared for the URL. In short, we are converting "Software Engineer" to "Software-Engineer", because that is how the Indeed URL is formed. $brandTitle1 is used for the first-page URL, and $brandTitle is used for all other pages.

            foreach ($locations as $location) {
                // Prepare the location parameters to save in the database
                $locationTitle = $location['name'];
                $locationId = $location['location_id'];
                $locationTitle1 = str_replace(' ', '-', $locationTitle);
                $locationTitle = str_replace(' ', '+', $locationTitle);
                // Get jobs for the first page
                $url = "https://www.indeed.com.pk/q-" . $brandTitle1 . "-l-" . $locationTitle1 . "-jobs.html";
                $jobLinks = self::getJobLinks($url);

Here, we’re executing another loop that combines each location with a brand to prepare the URL in the correct format. For example: Software Engineer in New York, Software Engineer in Indiana, Software Engineer in New Jersey, and so on. Once this loop finishes processing all locations for the current brand, we move on to the next brand and repeat the process.
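The two URL formats described above can be isolated into small helpers. This is only a sketch: the function names buildFirstPageUrl and buildResultsPageUrl are hypothetical and not part of the original script, while the URL patterns are copied from the code shown in this article.

```php
<?php
// Hypothetical helper names; only the URL patterns come from the article.

/** First results page: words are joined with hyphens. */
function buildFirstPageUrl(string $jobTitle, string $location): string
{
    $title = str_replace(' ', '-', $jobTitle);
    $place = str_replace(' ', '-', $location);
    return "https://www.indeed.com.pk/q-" . $title . "-l-" . $place . "-jobs.html";
}

/** Later results pages: words are joined with plus signs, and start is the result offset. */
function buildResultsPageUrl(string $jobTitle, string $location, int $start): string
{
    $title = str_replace(' ', '+', $jobTitle);
    $place = str_replace(' ', '+', $location);
    return "https://www.indeed.com.pk/jobs?q=" . $title . "&l=" . $place . "&start=" . $start;
}

echo buildFirstPageUrl('Software Engineer', 'New York'), PHP_EOL;
echo buildResultsPageUrl('Software Engineer', 'New York', 10), PHP_EOL;
```

Keeping the URL building in one place makes it easier to adjust if the site changes its URL scheme.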
Note that $locationTitle1 and $locationTitle are two different variables, used for the first page and the subsequent pages, respectively. The last line in this block calls another function that returns an array of job links. These links will then be used to fetch the job title and description for each listing.

    public static function getJobLinks($url)

This function includes some typical XPath code, which is easy to understand if you take a moment to look up how XPath works. The structure of an Indeed job listing is fairly straightforward: the job title is contained within an <a> tag with the class "jobtitle", so we extract the job URL from there. Once the URLs are obtained, they are returned to the parent function. That’s the exact process. You can copy and paste this function directly into your application. I’m confident it will work as expected.

                for ($i = 10; $i <= 50; $i = $i + 10) {
                    $url = "https://www.indeed.com.pk/jobs?q=" . $brandTitle . "&l=" . $locationTitle . "&start=" . $i;
                    if (is_array($jobLinks)) {
                        $jobLinks = array_merge($jobLinks, self::getJobLinks($url));
                    }
                } // end for loop
                $jobs = array();
                foreach ($jobLinks as $jobLink) {
                    $job = self::getJobDetails($jobLink);
                    if ($job) {
                        // Append the job as one entry (array_merge here would nest the whole list)
                        $jobs[] = $job;
                    }
                } // end foreach loop for job links
                // Key the results by the IDs; an array such as $location cannot be used as a key
                $jobArray[$brandId][$locationId] = $jobs;
            } // end locations loop
        } // end brands loop
        return $jobArray;
    } // end function

Notice the for loop: it runs with start values of 10, 20, 30, 40, and 50, so together with the first page it fetches six pages of results, because we only want the latest jobs for each location and brand. You can adjust this range based on your needs.

After that, we call another function, getJobDetails. It’s a simple and self-explanatory function that fetches the details of each job, including the description, title, and post date. If you have some experience with PHP programming, I’m confident you’ll easily understand the logic behind it.
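The link extraction that getJobLinks performs could look roughly like the sketch below. It is not the original function: the name extractJobLinks, the sample HTML, and the extra "featured" class are made up for illustration, and it works on an HTML string so it can be tried without fetching a live page. The only detail taken from the article is that each job title is an <a> tag with the class "jobtitle".

```php
<?php
/**
 * Hypothetical sketch: collect the href of every <a> whose class list
 * contains "jobtitle" and turn relative links into absolute ones.
 */
function extractJobLinks(string $html, string $baseUrl): array
{
    if (trim($html) === '') {
        return [];
    }
    $doc = new DOMDocument();
    // Real pages are rarely valid HTML, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match "jobtitle" as a whole class name, even when other classes are present.
    $query = '//a[contains(concat(" ", normalize-space(@class), " "), " jobtitle ")]';

    $links = [];
    foreach ($xpath->query($query) as $anchor) {
        $href = $anchor->getAttribute('href');
        if ($href === '') {
            continue;
        }
        if (!preg_match('#^https?://#i', $href)) {
            $href = rtrim($baseUrl, '/') . '/' . ltrim($href, '/');
        }
        $links[] = $href;
    }
    return array_values(array_unique($links));
}

// Made-up sample markup, only to show the shape of the input.
$sample = '<html><body>
  <h2><a class="jobtitle featured" href="/viewjob?jk=abc123">PHP Developer</a></h2>
  <h2><a class="jobtitle" href="/viewjob?jk=def456">Software Engineer</a></h2>
  <a class="other" href="/about">About</a>
</body></html>';

print_r(extractJobLinks($sample, 'https://www.indeed.com.pk'));
```

In a real run, the HTML string would come from whatever HTTP client you use to download the results page.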
Closing words:

I personally think this should be enough to help you get started with your next Indeed job scraping project. The code uses simple PHP concepts that shouldn’t take much time to understand, especially if you’ve been coding in PHP for a year or two. If you have any questions, feel free to leave a comment or contact me. I’d

How to Take a Screenshot of a Website with PHP


While programming in PHP, there are many situations where you might need a ready-made function. Capturing a screenshot of a website with PHP is one such task, often required in plugins and custom web applications.

Recently, I worked on a project where I needed to capture website screenshots using only the URL of the page. The project involved building a link directory where users could submit their website links along with other relevant information. I had to automatically capture and save a screenshot of each submitted website on the server. Once saved, the screenshot would then be displayed on the web page.

There were alternative approaches, such as asking users to upload a screenshot manually or letting them provide an external image URL. However, the most user-friendly solution was to automate the screenshot capture in my desired format. When I looked for a ready-made PHP solution, I was surprised to find there wasn’t a native function or widely used library for this. I kept searching Google with queries like "how to take screenshots of a website using PHP." Fortunately, I discovered the Google PageSpeed Insights API, which can be used to fetch and save website screenshots with ease.

Code to take screenshot of a website using php

Here’s what I developed using this amazing API. You’re free to use this code for your own projects or redistribute it as you wish. However, I highly recommend taking the time to read through and understand the details. It’ll help you become a better programmer.

In this example, I assume you’ve already created a form that sends the URL via POST to trigger this function. The function accepts the URL and returns the file name of the saved screenshot. All you need to do is update the directory path accordingly and ensure that the directory is writable. If it doesn’t already exist, the function will create it automatically.
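Since the URL arrives from a form via POST, it is worth checking it before handing it to the screenshot function. The following is a minimal sketch under my own assumptions: the helper name cleanSubmittedUrl and the form field name "url" are hypothetical and do not come from the original code.

```php
<?php
/**
 * Hypothetical helper: return a trimmed http(s) URL, or null when the
 * submitted value is not a usable web address.
 */
function cleanSubmittedUrl($input): ?string
{
    if (!is_string($input)) {
        return null;
    }
    $url = filter_var(trim($input), FILTER_VALIDATE_URL);
    if ($url === false) {
        return null;
    }
    // Only accept web pages, not ftp:, file:, and similar schemes.
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return in_array($scheme, ['http', 'https'], true) ? $url : null;
}

// "url" is an assumed form field name.
$siteUrl = cleanSubmittedUrl($_POST['url'] ?? null);
if ($siteUrl === null) {
    echo 'Please submit a valid http or https URL.', PHP_EOL;
}
```

Only a value that passes this check would then be given to the screenshot function described in this article.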
When you call the screenshot function in your PHP code, it returns the file name of the stored screenshot. You can then pass this back to your HTML to display the image. It’s that simple and easy.

Explanation of Code to take screenshot of a website using php

I highly recommend reading this explanation before implementing the solution. It will also help you customize the function to suit your specific needs.

The first step is our call to the Google PageSpeed API to fetch the screenshot. The raw response is then JSON-decoded to get at the image data:

    $data = json_decode($googlePagespeedData, true);
    $screenshot = $data['screenshot']['data'];

We decode the response and extract the screenshot data, which we’ll use to save the captured image to our server.

The next step removes special characters and formats the string properly for use. In the final step, we decode the string so it can be used to create the image file. This is one of the most important steps when capturing a website screenshot using PHP.

Now that we have the image string, we can use it to display the screenshot in the browser. However, it’s not efficient to call this function every time you want to display the image. A better approach is to save the image once and return its file name.

This is where you should modify your code: set the correct file name and directory path. Take a look at the $fileName variable: I’m appending a random number to avoid file name conflicts. Make sure to set the directory path appropriately. I’m using the previously mentioned directory for storing the image. In the end, the image data is stored in the $image variable. As you can see, taking website screenshots using PHP is simple and straightforward.

    // Create the directory only when it is missing; mkdir() fails if it already exists
    if ((is_dir($directory) || mkdir($directory)) && is_writable($directory)) {
        imagejpeg($image, $path, 90);
        return $fileName;
    } else {
        return 0;
    }

These lines create the directory at the specified path if it does not exist yet. If the directory already exists, it must be writable.
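The decoding and saving steps described above can be sketched end to end without calling the API. Everything here rests on assumptions that I want to state clearly: the function names decodeScreenshot and saveScreenshot are hypothetical, the response is assumed to carry URL-safe Base64 under screenshot.data as in the snippet from this article (the layout may differ between API versions, so check it against the response you actually receive), and the bytes are written with file_put_contents instead of the GD call imagejpeg used in the original, returning null rather than 0 on failure.

```php
<?php
/** Extract the screenshot from the JSON response and decode it to raw bytes. */
function decodeScreenshot(string $json): ?string
{
    $data = json_decode($json, true);
    $encoded = $data['screenshot']['data'] ?? null;
    if (!is_string($encoded) || $encoded === '') {
        return null;
    }
    // Assumption: the data is URL-safe Base64, so map "-" and "_" back to "+" and "/".
    $standard = strtr($encoded, '-_', '+/');
    // Restore padding that URL-safe encoders sometimes drop.
    $standard .= str_repeat('=', (4 - strlen($standard) % 4) % 4);
    $bytes = base64_decode($standard, true);
    return $bytes === false ? null : $bytes;
}

/** Save the bytes under a random file name; returns the file name or null on failure. */
function saveScreenshot(string $bytes, string $directory): ?string
{
    // mkdir() fails when the directory already exists, so check is_dir() first.
    if (!is_dir($directory) && !mkdir($directory, 0755, true)) {
        return null;
    }
    if (!is_writable($directory)) {
        return null;
    }
    $fileName = 'screenshot_' . random_int(100000, 999999) . '.jpg';
    $path = rtrim($directory, '/') . '/' . $fileName;
    return file_put_contents($path, $bytes) === false ? null : $fileName;
}

// Stand-in data instead of a real API response (these bytes are not a real JPEG).
$raw = "\xFF\xD8\xFF\xE0 demo bytes \xFB\xFF\xFE";
$json = json_encode(['screenshot' => ['data' => strtr(base64_encode($raw), '+/', '-_')]]);

$bytes = decodeScreenshot($json);
$dir = sys_get_temp_dir() . '/screenshots_demo';
$first = saveScreenshot($bytes, $dir);
$second = saveScreenshot($bytes, $dir); // still works although the directory now exists
echo $first, PHP_EOL, $second, PHP_EOL;
```

Writing the decoded bytes directly avoids re-encoding the image, while the GD route from the article lets you control the JPEG quality. Either way, the directory check has to succeed on the second call as well as the first.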
Make sure to adjust the permissions if you’re working on a Linux system. The function returns the file name of the created image, or 0 if there was an error. You should be able to track down the error and fix it in your case. By now the image should be created in your specified directory.

I am a software consultant and experienced programmer. If you encounter any issues implementing this function, feel free to leave a comment and I’ll do my best to help you. If you’re a webmaster looking for someone to develop this functionality for you, get in touch, and I’ll assist you with the implementation.
