[HADOOP-985] Namenode should identify DataNodes as ip:port instead of hostname:port - ASF JIRA

Details

Type: Improvement
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 0.11.0
Fix Version/s: 0.12.0
Component/s: None
Labels: None

Description

Right now the NameNode keeps track of DataNodes by "hostname:port". One proposal is to track them by "ip:port" instead. Various concerns have been expressed regarding hostnames and IPs. Please add your experiences here so that we have a better idea of what we should fix.

How should we calculate the datanode IP?

1) Just as we currently calculate the hostname, using "dfs.datanode.dns.interface" and "dfs.datanode.dns.nameserver". If the interface is specified incorrectly, the datanode could report an IP such as 127.0.0.1, which might or might not be intended.

2) The Namenode can use the remote socket address when the datanode registers. I am not sure how easy it is to get this address in RPC, or whether this is desirable.

3) The Namenode could simply resolve the hostname when a datanode registers. It could print a warning if the resolved IP and the reported IP don't match.
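
Option 3 can be sketched with the standard java.net classes. This is only an illustration of the idea, not code from any of the attached patches: the class name RegistrationCheck and both method names are hypothetical, and the warning is written to stderr rather than to Hadoop's logging.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper: not part of Hadoop or of the attached patches.
final class RegistrationCheck {

    // Returns true if reportedIp is one of the addresses that hostname resolves to.
    // An unresolvable hostname is treated as "no match".
    static boolean reportedIpMatches(String hostname, String reportedIp) {
        try {
            InetAddress reported = InetAddress.getByName(reportedIp);
            for (InetAddress resolved : InetAddress.getAllByName(hostname)) {
                if (resolved.equals(reported)) {
                    return true;
                }
            }
        } catch (UnknownHostException e) {
            // Could not resolve: fall through and report a mismatch.
        }
        return false;
    }

    // Prints a warning when the reported IP does not match the resolved addresses.
    static void checkRegistration(String hostname, String reportedIp) {
        if (!reportedIpMatches(hostname, reportedIp)) {
            System.err.println("WARN: datanode " + hostname + " reported " + reportedIp
                    + ", which does not match its resolved addresses");
        }
    }

    public static void main(String[] args) {
        // IP literals are used here so the example does not depend on DNS.
        checkRegistration("127.0.0.1", "127.0.0.1"); // addresses match, no warning
        checkRegistration("127.0.0.1", "10.0.0.1");  // mismatch, warning on stderr
    }
}
```

Comparing against every address returned by getAllByName, rather than only the first, avoids false warnings for hosts with several address records.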

One advantage of using IPs is that DFSClient does not need to resolve them when it connects to a datanode. This could save a few milliseconds for each block. Also, DFSClient should check all of its own IPs to see whether a given IP is local.
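
The locality check mentioned above could look roughly like the following. It is a sketch under assumptions: LocalAddressCheck and isLocalAddress are hypothetical names, not DFSClient code, and the check relies only on java.net.NetworkInterface to ask whether any local interface is bound to the given address.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;

// Hypothetical helper: not part of DFSClient.
final class LocalAddressCheck {

    // Returns true if ip is a loopback address or is bound to one of this
    // machine's network interfaces. Any lookup failure is treated as "not local".
    static boolean isLocalAddress(String ip) {
        try {
            InetAddress addr = InetAddress.getByName(ip);
            if (addr.isLoopbackAddress()) {
                return true;
            }
            // getByInetAddress returns null when no interface has this address.
            return NetworkInterface.getByInetAddress(addr) != null;
        } catch (UnknownHostException | SocketException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String ip : args) {
            System.out.println(ip + " local=" + isLocalAddress(ip));
        }
    }
}
```

Because the input is already an IP literal, InetAddress.getByName only parses it and performs no DNS lookup, which is the saving the description refers to.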

As far as I can see, the namenode does not resolve any DNS names in normal operation, since it does not actively contact datanodes. In that sense, I am not sure this would change Namenode performance at all.

Thoughts?

Attachments


dfshealth.html, 09/Feb/07 21:40, 4 kB, Raghu Angadi
HADOOP-985-1.patch, 10/Feb/07 01:58, 14 kB, Raghu Angadi
HADOOP-985-2.patch, 10/Feb/07 02:17, 14 kB, Raghu Angadi
HADOOP-985-3.patch, 15/Feb/07 23:27, 23 kB, Raghu Angadi
HADOOP-985-4.patch, 15/Feb/07 23:59, 23 kB, Raghu Angadi
HADOOP-985-5.patch, 17/Feb/07 00:33, 25 kB, Raghu Angadi
HADOOP-985-6.patch, 22/Feb/07 00:27, 25 kB, Raghu Angadi

Issue Links

incorporates

HADOOP-697 Duplicate data node name calculation in Datanode constructor

Closed

relates to

HADOOP-6867 Using socket address for datanode registry breaks multihoming

Resolved

HADOOP-685 DataNode appears to require DNS name resolution as opposed to direct ip mapping

Closed

Activity

People

Assignee: Raghu Angadi

Reporter: Raghu Angadi

Votes: 0

Watchers: 3

Dates

Created: 06/Feb/07 21:25

Updated: 20/Jul/10 04:52

Resolved: 22/Feb/07 19:50