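Error analysis:
When a task is run with @Scheduled, Spring by default executes scheduled tasks on a single scheduler thread. If one run blocks, that thread is never freed, so later runs are simply not triggered. This is what happens when an HTTP request is made with the JDK's built-in HttpURLConnection and no timeout is set: the scheduler thread waits indefinitely for the remote side to respond, stays occupied by that one run, and when the next trigger time arrives there is no thread left to execute it.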
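The stall described above can be reproduced without Spring. The following is a minimal sketch, not Spring's actual implementation: a JDK single-thread ScheduledExecutorService stands in for the scheduler, and a CountDownLatch stands in for an HTTP call that never returns. The class and method names are made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SingleThreadStallDemo {

    /** Returns {runs started while the first run was stuck, runs started after it was released}. */
    static int[] runDemo() throws InterruptedException {
        // One scheduler thread, like the default setup for @Scheduled tasks.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger started = new AtomicInteger();
        CountDownLatch release = new CountDownLatch(1);
        try {
            // The task should fire every 100 ms.
            scheduler.scheduleAtFixedRate(() -> {
                if (started.incrementAndGet() == 1) {
                    try {
                        // Stand-in for a request with no timeout: the first run hangs here.
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            }, 0, 100, TimeUnit.MILLISECONDS);

            Thread.sleep(500);              // several trigger times pass
            int whileStuck = started.get(); // only the hanging run has started
            release.countDown();            // the "remote side" finally answers
            Thread.sleep(500);
            int afterRelease = started.get();
            return new int[] {whileStuck, afterRelease};
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = runDemo();
        System.out.println("runs started while stuck: " + result[0]);
        System.out.println("runs started after release: " + result[1]);
    }
}
```

While the first run is stuck, the first count stays at 1 even though several trigger times have passed. Runs resume only once the stuck run returns.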
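After a run finishes normally, the scheduler thread should be idle in the TIMED_WAITING state until the next trigger time. In the faulty case it stays RUNNABLE instead, which shows that the previous run never finished and the thread is still inside it (the JVM reports a thread that is waiting on socket I/O as RUNNABLE). Note that these states have to be observed from outside the thread, for example in a thread dump: a thread that queries its own state, as the log statement at the end of the task method does, always sees RUNNABLE.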
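To see the state of a thread that is stuck waiting for a response, it can be sampled from another thread. The sketch below uses only JDK classes. A local ServerSocket that accepts a connection and then stays silent plays the unresponsive site. All names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class BlockedReadStateDemo {

    /** Samples the state of a thread that is waiting for data on a socket with no read timeout. */
    static Thread.State stateDuringBlockedRead() throws IOException, InterruptedException {
        InetAddress loopback = InetAddress.getByName("127.0.0.1");
        try (ServerSocket server = new ServerSocket(0, 50, loopback)) {
            int port = server.getLocalPort();
            Thread reader = new Thread(() -> {
                try (Socket socket = new Socket(loopback, port);
                     InputStream in = socket.getInputStream()) {
                    in.read(); // no timeout set: waits until data arrives or the peer closes
                } catch (IOException ignored) {
                    // not relevant for this demo
                }
            }, "blocked-reader");
            reader.setDaemon(true);
            reader.start();

            // Accept the connection but never write anything to it.
            try (Socket accepted = server.accept()) {
                Thread.sleep(500); // give the reader time to enter read()
                return reader.getState(); // observed from outside the reader thread
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("state while waiting for a response: " + stateDuringBlockedRead());
    }
}
```

On common HotSpot-based JDKs this is expected to print RUNNABLE, matching the abnormal state described above, although from the application's point of view the thread is doing nothing useful.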
Normal state after a run completes:
Abnormal state after a run:
Incorrect code:
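import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;

import lombok.extern.log4j.Log4j2;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
@Log4j2
public class ScheduleTask {

    @Autowired
    private SecondUserMapper secondUserMapper;

    // Scheduled-task annotation with a cron expression: run every 2 minutes
    @Scheduled(cron = "0 0/2 * * * ?")
    public void testScheduleTask() throws IOException, InterruptedException {
        List<String> urlList = secondUserMapper.getRandomUrlList();
        Thread.sleep(3000);
        for (String s : urlList) {
            URL url = new URL(s);
            // Open the connection
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            // Use the GET method
            connection.setRequestMethod("GET");
            // Add custom request headers
            String agent = secondUserMapper.getRandomAgent();
            connection.addRequestProperty("User-Agent", agent);
            connection.addRequestProperty("Accept-Language", "en-US,en;q=0.9");
            // Get the status code returned by the server.
            // BUG: no connect/read timeout is set, so this call can wait forever
            // if the server never responds.
            int responseCode = connection.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_OK) {
                // Read the data returned by the server
                BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
                String line;
                StringBuilder response = new StringBuilder();
                while ((line = reader.readLine()) != null) {
                    response.append(line);
                }
                reader.close();
                System.out.println("Server Response:" + responseCode);
            } else {
                System.out.println("Error Code: " + responseCode);
            }
            // Close the connection
            connection.disconnect();
        }
        // A thread that queries its own state always sees RUNNABLE
        log.info("Thread name: {}, thread state: {}", Thread.currentThread().getName(), Thread.currentThread().getState().name());
    }
}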
Correct code:
Adding a connect timeout and a read timeout is enough:
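import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;

import lombok.extern.log4j.Log4j2;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
@Log4j2
public class ScheduleTask {

    @Autowired
    private SecondUserMapper secondUserMapper;

    // Scheduled-task annotation with a cron expression: run every 2 minutes
    @Scheduled(cron = "0 0/2 * * * ?")
    public void testScheduleTask() throws IOException, InterruptedException {
        List<String> urlList = secondUserMapper.getRandomUrlList();
        Thread.sleep(3000);
        for (String s : urlList) {
            URL url = new URL(s);
            // Open the connection
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();
            // Use the GET method
            connection.setRequestMethod("GET");
            // The fix: bound both connecting and reading (values are in milliseconds, i.e. 100 seconds).
            // When a timeout expires, a SocketTimeoutException (an IOException) is thrown.
            connection.setConnectTimeout(100000);
            connection.setReadTimeout(100000);
            // Add custom request headers
            String agent = secondUserMapper.getRandomAgent();
            connection.addRequestProperty("User-Agent", agent);
            connection.addRequestProperty("Accept-Language", "en-US,en;q=0.9");
            // Get the status code returned by the server
            int responseCode = connection.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_OK) {
                // Read the data returned by the server
                BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
                String line;
                StringBuilder response = new StringBuilder();
                while ((line = reader.readLine()) != null) {
                    response.append(line);
                }
                reader.close();
                System.out.println("Server Response:" + responseCode);
            } else {
                System.out.println("Error Code: " + responseCode);
            }
            // Close the connection
            connection.disconnect();
        }
        // A thread that queries its own state always sees RUNNABLE
        log.info("Thread name: {}, thread state: {}", Thread.currentThread().getName(), Thread.currentThread().getState().name());
    }
}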
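With the timeouts in place, an unresponsive server produces a SocketTimeoutException instead of an endless wait. The sketch below shows that behavior in isolation, using only JDK classes. The helper fetchStatus, the TIMED_OUT marker, and the short timeout values are assumptions chosen for the demo, not part of the original project. A local ServerSocket that never answers plays the silent site.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URL;

class TimeoutFetchDemo {

    /** Hypothetical marker returned when the server did not answer in time. */
    static final int TIMED_OUT = -1;

    /** Returns the HTTP status code, or TIMED_OUT if connecting or reading took too long. */
    static int fetchStatus(String address, int connectTimeoutMs, int readTimeoutMs) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(address).openConnection();
        connection.setRequestMethod("GET");
        connection.setConnectTimeout(connectTimeoutMs);
        connection.setReadTimeout(readTimeoutMs);
        try {
            return connection.getResponseCode();
        } catch (SocketTimeoutException e) {
            // The remote side stayed silent: give up on this URL and let the caller move on.
            return TIMED_OUT;
        } finally {
            connection.disconnect();
        }
    }

    /** Requests a page from a local socket that accepts connections but never replies. */
    static int fetchFromSilentServer() throws IOException {
        try (ServerSocket silent = new ServerSocket(0, 50, InetAddress.getByName("127.0.0.1"))) {
            String address = "http://127.0.0.1:" + silent.getLocalPort() + "/";
            return fetchStatus(address, 1000, 300);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("status from silent server: " + fetchFromSilentServer());
    }
}
```

Catching the timeout per URL keeps one silent site from aborting the whole loop. It is also worth checking that the timeouts, multiplied by the number of URLs, still fit inside the 2-minute cron interval. With 100-second timeouts a single slow site can already consume most of it.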
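Summary:
We usually think of a thread as blocked only when it is in the BLOCKED state, but that state is just the JVM's own classification (waiting to acquire a monitor lock). From the business point of view, a thread is blocked whenever its work stops making progress, and that can happen in any state, including RUNNABLE as seen here.
This project is a web crawler that scrapes data from other people's sites. Some sites, once they detect a crawler, do not just refuse access: they deliberately never send a response, so that your server's threads are all tied up and the crawler server eventually goes down.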
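A read timeout bounds each individual read, not the request as a whole, so a site that keeps the connection open and sends a few bytes now and then could still hold a thread for a long time. One possible extra safeguard is an overall deadline per request, sketched here with plain JDK classes. This is an assumption about how one might harden the crawler, not something the original project does. callWithDeadline and the other names are hypothetical, and a sleeping task stands in for the slow site.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class DeadlineDemo {

    /** Runs work on a worker thread and returns fallback if it has not finished within deadlineMs. */
    static <T> T callWithDeadline(ExecutorService workers, Callable<T> work, long deadlineMs, T fallback)
            throws InterruptedException, ExecutionException {
        Future<T> future = workers.submit(work);
        try {
            return future.get(deadlineMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Stop waiting; the calling (scheduler) thread is free to continue.
            future.cancel(true);
            return fallback;
        }
    }

    /** A task that takes 5 seconds is given a 200 ms deadline. */
    static String demo() throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(2);
        try {
            Callable<String> slowSite = () -> {
                Thread.sleep(5_000); // stand-in for a site that drags out its response
                return "response";
            };
            return callWithDeadline(workers, slowSite, 200, "gave up");
        } finally {
            workers.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

The deadline only frees the calling thread. cancel(true) interrupts the worker, but a thread blocked in a classic socket read generally does not react to interruption, so the socket timeouts are still needed and the worker pool should stay bounded.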