from : Micro reading https://www.weidianyuedu.com
daily Web In the process of end crawler , We often encounter scenarios where parameters are encrypted , therefore , We need to analyze the web page source code
Pass mode , Layer by layer, the key JS Code , Use Python To execute this code , Get the parameters before and after encryption Python Realization
This article will talk about the use of Python call JS Of 4 Ways of planting
In a simple JS The script, for example , Write the code to a file
//norm.js// Calculate the sum of two numbers function add(num1, num2) { return num1 + num2;}
among , Defines a method , Calculate the sum of two numbers
//py_exec_js_demo.py// Installation dependency pip3 install PyExecJS
then , from JS Read the source code from the file
def js_from_file(file_name): """ Read js file :return: """ with open(file_name, "r", encoding="UTF-8") as file: result = file.read() return result
Last , Use execjs Class compile() Method to compile and load the above JS character string , Returns a context object
import execjsfrom js_code import *# Compile load js character string context1 = execjs.compile(js_from_file("./norm.js"))
Last , Call the of the context object call() Method execution JS Method among them , Parameters include :JS The name of the method whose code is called 、 The incoming parameters of the corresponding method
# call js In code add() Method , Parameter is 2 and 3# Method name :add# Parameters :2 and 3result1 = context1.call("add", 2, 3)print(result1)
Need to pay attention to , because PyExecJS Run locally JS In the environment , It will start before use JS Environmental Science , Eventually, the running speed will be slow. For more functions, please refer to :https://github.com/doloopwhile/PyExecJS
# Install dependency library pip3 install js2py
And then use js2py Medium EvalJs() Method to generate a context object
# Use to get up and down js2py Generate a context context = js2py.EvalJs()
Then use the context object to execute JS Script , Convert to Python Code
# Execute whole paragraph JS Code context.execute(js_content)
Last , Use context to call JS The method in , And make input parameters
# Use context context Call specific functions # Function name :add# Parameters :1,2result = context.add(1, 2)print(result)
It's important to note that , If JS It's a long confusing code , Convert to Python The process may report an error
More functions can be found in :
https://github.com/PiotrDabkowski/Js2Py
It's actually using Python Of os.popen perform node command , perform JS Script
First , Make sure that... Is installed locally Node.js Environmental Science
modify JS Script , Add an export function init , It is convenient for internal functions to be called
// Calculate the sum of two numbers function add(num1, num2) { return num1 + num2;}// Add an export function (node The way )module.exports.init = function (arg1, arg2) { // Call function , And back to console.log(add(arg1, arg2));};
then , Will call JS Method to form a string
# Composition call js The order of # node command :node -ecmd = "node -e "require(\\"%s\\").init(%s,%s)"" % ("./norm", 3, 5)
Last , adopt os.popen Just execute the order
pipeline = os.popen(cmd)# Read the results result = pipeline.read()print(" The result is :", result)
PyV8 yes Google take Chrome V8 lead m.weidianyuedu.com Engine use Python Encapsulated dependency Libraries
It does not rely on local JS Environmental Science , Very fast
import PyV8from js_code import js_from_filewith PyV8.JSContext() as ctx: ctx.eval(js_from_file("./norm.js"))# call js function , Specify the parameters ctx.locals.add(1, 2)
But after repeated tests, it was found that ,MAC and PC stay Python3 In the environment , Use PyV8 Will report all kinds of strange questions , So... Is not recommended !
More functions can be found in :
https://github.com/emmetio/pyv8-binaries
It's summarized above Python call JS Of 4 Ways of planting
In the actual crawler project , It's usually used first node Command a test , After making sure there's no problem , Before reuse 3 In any of these ways Python rewrite .