
Received an UnknownHostException when attempting to interact with a service. See cause for the exact endpoint that is failing to resolve. If this is happening on an endpoint that previously worked, there may be a network connectivity issue or your DNS cache could be storing endpoints for too long.

Oct 30 01:22:14 ubuntu dockerd[2293083]: time="2024-10-30T01:22:14.752476083Z" level=error msg="[resolver] failed to query external DNS server" client-addr="udp:127.0.0.1:59546" dns-server="udp:127.0.0.53:53" error="read udp 127.0.0.1:59546->127.0.0.53:53: i/o timeout" question=";sqs.ap-northeast-2.amazonaws.com.\tIN\t A" spanID=0e95ec0f4aa8fcbc traceID=c69346a57036fa48d3850134bb60b134
Oct 30 01:24:37 ubuntu newrelic-infra-service[3023646]: time="2024-10-30T01:24:37Z" level=warning msg="[engine] failed to flush chunk '3024031-1730251471.397479652.flb', retry in 9 seconds: task_id=0, input=tail.9 > output=newrelic.0 (out_id=0)" component=integrations.Supervisor output=stderr process=log-forwarder

The error above is an exception that can occur when the SQS client in the AWS SDK for Java makes HTTP calls to process messages registered in a queue. As discussed in "DNS Developers Should Know", even a developer who understands the concepts behind DNS may struggle to find the cause of a situation like this and respond quickly. So what actually caused this network problem?

/etc/resolv.conf

First, on Linux, NetworkManager manages the local and external DNS server settings through /etc/resolv.conf. On the office machine where the problem occurred, the router's IP address and Cloudflare (1.1.1.1) were configured as the DNS servers.
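As a quick way to confirm which resolvers a host is actually configured with, the sketch below reads the `nameserver` entries from /etc/resolv.conf. The class name `ResolvConf` is hypothetical, and the parsing covers only the simple `nameserver <address>` form, so treat it as an illustration rather than a complete parser.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class ResolvConf {
    /** Extracts the addresses from "nameserver <address>" lines, skipping comments. */
    static List<String> nameservers(List<String> lines) {
        List<String> servers = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            // resolv.conf comments start with '#' or ';'.
            if (trimmed.isEmpty() || trimmed.startsWith("#") || trimmed.startsWith(";")) {
                continue;
            }
            String[] fields = trimmed.split("\\s+");
            if (fields.length >= 2 && fields[0].equals("nameserver")) {
                servers.add(fields[1]);
            }
        }
        return servers;
    }

    public static void main(String[] args) throws IOException {
        // Prints the resolvers configured on the machine this runs on.
        Path path = Path.of("/etc/resolv.conf");
        System.out.println(nameservers(Files.readAllLines(path)));
    }
}
```

On hosts that use a local stub resolver, the file may list only a loopback address such as the 127.0.0.53 seen in the dockerd log, with the upstream servers configured elsewhere.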

JVM์˜ DNS ์บ์‹ฑ ๊ธฐ๋ณธ๊ฐ’์€ 30์ดˆ โ€‹

The Java virtual machine (JVM) caches DNS name lookups. When the JVM resolves a hostname to an IP address, it caches the IP address for a specified period of time, known as the time-to-live (TTL). Because AWS resources use DNS name entries that occasionally change, we recommend that you configure your JVM with a TTL value of 5 seconds.
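One way to apply a TTL like the one recommended above is to set the `networkaddress.cache.ttl` security property programmatically. The sketch below is a minimal illustration; the class name `DnsCacheTtl` is hypothetical, and it assumes the property is set early in startup, before the JVM performs its first hostname lookup, since the cache policy is typically read only once.

```java
import java.security.Security;

final class DnsCacheTtl {
    // Security property that controls caching of successful lookups.
    static final String TTL_PROPERTY = "networkaddress.cache.ttl";

    /** Sets the JVM-wide TTL (in seconds) for successful DNS lookups. */
    static void apply(int seconds) {
        Security.setProperty(TTL_PROPERTY, Integer.toString(seconds));
    }

    public static void main(String[] args) {
        // Assumption: this runs before any hostname has been resolved in this process.
        apply(5);
        System.out.println(TTL_PROPERTY + "=" + Security.getProperty(TTL_PROPERTY));
    }
}
```

The same property can also be set by editing the `java.security` file of the runtime. Note that a shorter TTL means more frequent lookups, so the application depends more heavily on the resolver being reachable.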

The AWS SDK for Java uses InetAddress.getAllByName, so it depends on the JVM's DNS TTL setting. The following is the comment from the java.security file shipped with Amazon Corretto 17. Because no Security Manager is set in a typical deployment, DNS results are cached for 30 seconds by default.

#
# The Java-level namelookup cache policy for successful lookups:
#
# any negative value: caching forever
# any positive value: the number of seconds to cache an address for
# zero: do not cache
#
# default value is forever (FOREVER). For security reasons, this
# caching is made forever when a security manager is set. When a security
# manager is not set, the default behavior in this implementation
# is to cache for 30 seconds.
#
# NOTE: setting this to anything other than the default value can have
#       serious security implications. Do not set it unless
#       you are sure you are not exposed to DNS spoofing attack.
#
#networkaddress.cache.ttl=-1

๋”ฐ๋ผ์„œ, ์ •์ƒ์ ์œผ๋กœ ์‹คํ–‰์ค‘์ธ ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์—์„œ ๊ฐ‘์ž๊ธฐ DNS ์š”์ฒญ์ด ์ˆ˜ํ–‰๋˜์—ˆ๋Š”์ง€๋ฅผ ์ดํ•ดํ•  ์ˆ˜ ์žˆ๊ณ , ํ•ด๋‹น ์š”์ฒญ์„ ์ˆ˜ํ–‰ํ•œ ์‹œ์ ์— DNS ์„œ๋ฒ„์—์„œ๋Š” ์š”์ฒญ์— ๋Œ€ํ•œ ์‘๋‹ต์„ ํ•  ์ˆ˜ ์—†์—ˆ๋‹ค๋Š” ๊ฒƒ์„ (failed to query external DNS server ์˜ค๋ฅ˜ ๋ฉ”์‹œ์ง€๋ฅผ ํ†ตํ•ด) ์•Œ ์ˆ˜ ์žˆ๊ฒŒ ๋ฉ๋‹ˆ๋‹ค.

DNS ์š”์ฒญ์ด ์‹คํŒจํ•œ ์ด์œ  โ€‹

dig sqs.ap-northeast-2.amazonaws.com

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.amzn2.13.8 <<>> sqs.ap-northeast-2.amazonaws.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45612
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;sqs.ap-northeast-2.amazonaws.com. IN   A

;; ANSWER SECTION:
sqs.ap-northeast-2.amazonaws.com. 16 IN A       3.34.228.79

;; Query time: 0 msec
;; SERVER: 192.168.0.2#53(192.168.0.2)
;; WHEN: Sun Nov 03 05:53:46 UTC 2024
;; MSG SIZE  rcvd: 77

You can query DNS for sqs.ap-northeast-2.amazonaws.com with the dig (or nslookup) command. Under normal conditions, the answer should come back over UDP as in the output above. In the earlier error message, however, the DNS query timed out. I found out only later that someone in the office had been uploading and downloading roughly 60 GB of documents on Google Drive to share project materials. That traffic most likely saturated the office network long enough for the DNS queries to time out.

Not actually a critical problem for the application

The application that processes SQS messages through the AWS SDK receives and handles the delivery results of Kakao AlimTalk messages sent to users. So even if it temporarily fails to process the AlimTalk result messages stored in SQS, this is not a critical problem, as long as SQS does not stay unreachable. Even so, it is still necessary to check alerts for DNS errors and to monitor the application's behavior periodically.
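For monitoring, it can help to separate DNS failures from other errors. The SDK message quoted at the top says to look at the cause, so the sketch below walks an exception's cause chain looking for `UnknownHostException`. The class name `DnsFailures` is hypothetical, and `RuntimeException` merely stands in for whatever exception type the SDK actually throws.

```java
import java.net.UnknownHostException;

final class DnsFailures {
    // Upper bound on chain length, guarding against cyclic cause chains.
    private static final int MAX_DEPTH = 32;

    /** Returns true if the error or any of its causes is an UnknownHostException. */
    static boolean isDnsFailure(Throwable error) {
        int depth = 0;
        for (Throwable t = error; t != null && depth < MAX_DEPTH; t = t.getCause(), depth++) {
            if (t instanceof UnknownHostException) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for an SDK exception that wraps the underlying lookup failure.
        Throwable wrapped = new RuntimeException(
                "Received an UnknownHostException when attempting to interact with a service.",
                new UnknownHostException("sqs.ap-northeast-2.amazonaws.com"));
        System.out.println(isDnsFailure(wrapped));
        System.out.println(isDnsFailure(new IllegalStateException("unrelated")));
    }
}
```

A message-processing loop could use a check like this to count DNS failures and raise an alert only when they persist, which matches the view that a brief outage is tolerable while a sustained one is not.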

Anyway, it was just a minor incident!...
