

Zachary Arcaro ~ Portfolio



As I am currently working on my thesis project,  I have included this report from a related pilot project as it introduces most of the major themes contained in my thesis.   





Evaluating the Use of LIDAR Multiple Return Data to Characterize Forest Canopy Structure in Croatan National Forest


Forestry 753

Fall 2006

North Carolina State University

Zachary Arcaro



Vegetation structure is known to be a predictor of animal habitat, but characterizing forest structure is generally limited to field plot sampling. The goal of this project was to evaluate the use of LIDAR in characterizing forest structure. LIDAR (LIght Detection And Ranging) is a remote sensing technique that uses precise spatial location and the two-way travel time of laser light pulses to produce a highly accurate representation of the targeted ground area. The recorded reflections of the laser pulses are known as returns, and modern LIDAR systems can record multiple returns for each pulse. A single pulse from a LIDAR system could, for example, record a return for the top of a tree, several of its branches and, finally, the ground near the base of the tree. A common current use for LIDAR is the production of elevation data for large areas; in this application, all returns except the last, the one representing the ground, are discarded as noise. The aim of this project was to take full advantage of LIDAR data recorded over Croatan National Forest by using the multiple returns to discover and characterize the structure of the forest.
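As a rough illustration of the two-way travel time idea described above, the following sketch converts hypothetical return times into ranges and into heights above the last return. The travel times are invented for the example, the pulse is assumed to point straight down, and the speed of light in vacuum is used without any atmospheric correction.

```python
# Speed of light in vacuum, in meters per second. The slight slowdown in air
# is ignored in this sketch.
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def range_from_travel_time(two_way_seconds):
    """Return the sensor-to-target distance in meters for one return."""
    # The pulse travels out and back, so only half the path is the range.
    return SPEED_OF_LIGHT_M_PER_S * two_way_seconds / 2.0


# Hypothetical two-way travel times (seconds) for three returns from a single
# pulse: a treetop, a branch, and the ground.
return_times = [6.60e-6, 6.63e-6, 6.70e-6]
ranges = [range_from_travel_time(t) for t in return_times]

# Height of each return above the last (ground) return, assuming the pulse
# points straight down.
heights_above_ground = [ranges[-1] - r for r in ranges]

for label, height in zip(["treetop", "branch", "ground"], heights_above_ground):
    print(f"{label}: {height:.2f} m above the ground return")
```

In an elevation-only workflow just the last of these ranges would be kept; the earlier returns are the ones this project sets out to use.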


Objectives and Rationale

Forests are extremely valuable as habitat for wildlife; furthermore, specific information about a forest can be used to predict how different species will respond (Szaro and Balda, 1979). Characteristics such as which tree species are present and the density of the trees and shrubs are particularly important and can generally be determined using satellite imagery (Wang, et al, 2004; Conghe and Woodcock, 2003). The objective of this research was to demonstrate the utility of LIDAR multiple return data for analyzing another important forest characteristic, the vertical forest structure. In Croatan National Forest, LIDAR-based data was used to locate trees, and to derive information about them, within both pocosin pine (Pinus serotina) and longleaf pine (Pinus palustris) stands.


Study Area

The area chosen for this project is the Croatan National Forest, located in North Carolina’s coastal plain. The 159,886-acre site is bounded by the Neuse River on the east, by the White Oak River on the west, and by the Trent River on the north, and is separated from Bogue Sound on the south by state road 24. The site was chosen because it has a variety of forest types (longleaf, Pocosin or pond pine woodlands, and planted loblolly pine), because its National Forest status makes it accessible for possible future field data collection, and because LIDAR data coverage of the area is available.




The multiple return LIDAR data was acquired from the USGS Center for LIDAR Information and Knowledge (CLICK) website (http://lidar.cr.usgs.gov/). The data was delivered in (XYZ) ASCII text format. The “bare earth” LIDAR data was downloaded from the North Carolina Floodplain Mapping Program website (http://www.ncfloodmaps.com). It was also in ASCII text format.


The LIDAR data in both cases originated from a joint effort between the State of North Carolina and FEMA to use LIDAR to update the state’s floodplain delineation. This was in response to Hurricane Floyd, which struck in 1999, causing up to 6 billion dollars of damage and killing 56 people, most by flood waters (Smith, 2002). The data used in this project was flown between February and April 2003. The horizontal datum is NAD83 (1995) North Carolina State Plane feet and the vertical datum is NAVD88 US Survey Feet.


The project also used a shapefile (and other associated files) containing forest stand inventory information. This was acquired from the Croatan National Forest. This file contained information including stand age, trees per acre (TPA), forest type and acreage.


Two digital orthophoto quarter quadrangles (DOQQs) were used. They were acquired in 1998, produced by the USGS, and downloaded from the NCSU library website. They are in the SID format and the spatial reference is NAD 1983 State Plane North Carolina FIPS 3200 (meters).


Panchromatic imagery with 2-foot resolution was also used in this project for accuracy assessment. This data was downloaded from www.nconemap.com. The data is in the GeoTIFF format and the spatial reference is NAD 83 State Plane North Carolina FIPS 3200 Feet. This imagery was acquired in fall 2003.


General Approach


Before any analysis could be done, the first step was to convert both the multiple return and bare earth LIDAR data into a projected raster format. The next step was to find the heights of the trees by subtracting the raster derived from the bare earth data from the raster derived from the multiple return data. The result of this step was a single raster image that gave the height of vegetation above the ground. In addition to preparing the LIDAR data, the specific study sites were selected using the stand maps and DOQQ images. The stand map allowed for the selection of pocosin areas versus longleaf areas and the DOQQ allowed for the selection of stands based on vegetation patterns. Next, the analysis proceeded with the goal of finding the treetops within the study areas. The method used in this project involved a variable size window local maximum function. Varying the size of the window accounted for the fact that the canopy size of a tree increases with its height. The local maximum neighborhood function allowed for the selection of the high points within each window.


Specific Approach


Preparing the LIDAR Data

The work of preparing the LIDAR data was performed using ArcMap. All of the LIDAR data was originally downloaded in ASCII txt format. Because neither ERDAS Imagine nor ArcMap could natively handle data in this format, it was necessary to download and install LIDAR Analyst For ArcGIS from Visual Learning Systems. This program was used to convert the txt files into LAS files. The LAS file standard was introduced in 2002 for use with LIDAR data and has replaced proprietary LIDAR file types (Graham, 2005). After conversion to the LAS file type it was necessary to download and install LAS Reader For ArcGIS 9 from GeoCUE, which enabled ArcMap to read the LAS files. The next step involved the use of the “CreateTIN” and “EDIT TIN” tools within ArcMap to create a TIN file from each of the LAS files. The “TIN to Raster” and “Project Raster” tools were then used to change the TIN into a raster and then to correctly define the projection (Lambert Conformal Conic NAD 1983 State Plane North Carolina FIPS 3200 feet). The files were then resampled into one meter grids, both to reduce the large file sizes and to introduce metric units. The penultimate step was to re-project the files into the State Plane North Carolina meters coordinate system to match the existing DOQQ and stand data. Finally, the raster derived from the bare earth LIDAR data was subtracted from the raster derived from the multiple return LIDAR data. As mentioned above, this step resulted in a single raster image that gave the height of vegetation above the ground. Feet were kept as the units for vertical measurements.
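The final subtraction step can be sketched outside of ArcMap as a cell-by-cell difference of two aligned grids. This is only an illustration, not the ArcMap tool chain used in the project: the rasters are plain lists of rows, `None` stands in for a no-data cell, the sample elevations are invented, and clamping negative differences to zero is an assumption made here for tidiness.

```python
def canopy_height(surface, bare_earth):
    """Subtract a bare earth raster from a surface raster, cell by cell.

    Both rasters are lists of rows with identical dimensions and the same
    vertical units (feet in this report). None marks a no-data cell.
    """
    if len(surface) != len(bare_earth):
        raise ValueError("rasters must have the same number of rows")
    result = []
    for surface_row, ground_row in zip(surface, bare_earth):
        if len(surface_row) != len(ground_row):
            raise ValueError("rasters must have the same number of columns")
        row = []
        for top, ground in zip(surface_row, ground_row):
            if top is None or ground is None:
                row.append(None)  # keep no-data where either input lacks a value
            else:
                # Assumption: small negative differences are noise, so clamp to 0.
                row.append(max(top - ground, 0.0))
        result.append(row)
    return result


# Hypothetical elevations in feet for a 2 x 2 patch.
surface = [[72.0, 35.5], [12.0, 11.8]]
bare_earth = [[12.0, 12.5], [12.0, 12.0]]
print(canopy_height(surface, bare_earth))
```

Both inputs must already share the same cell size, extent, and projection, which is what the resampling and re-projection steps above provide.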


Specific Study Areas

Four study areas were chosen within Croatan National Forest for analysis. This was done because of the limited availability of stand data and the computational limitations associated with the manipulation of LIDAR data covering large areas. Additionally, choosing four limited areas allowed for the selection of very representative plots of the two species under consideration. Two pocosin pine sites were chosen: one with taller trees, called “high” pocosin for the purposes of this study, and one with shorter trees, called “low” pocosin. These two conditions are common for this species, which grows quite differently depending on the quality of the soil (Kologiski, 1977). The two sites in this study both date back to 1920 according to the stand data. The other species under consideration in this study was longleaf pine. Two sites were also chosen for this species, an “open” stand and a “closed” stand. These sites were similarly chosen based on common conditions, with denser or “closed” stands the result of less disturbance (Kologiski, 1977) and open stands resulting from more frequent disturbance, in this case active management including prescribed burning.


Data Analysis

Locating the trees within the study areas was done, as mentioned above, using a variable size window local maximum (LM) function. “The LM technique used with LIDAR data operates on the assumption that the highest laser elevation value among laser hits of the same tree crown is the apex” (Popescu, et al, 2002). Specifically, the local maximum filter looks at the neighborhood around each pixel and assigns each pixel, one by one, a value equal to that of the highest valued pixel within the neighborhood. The LM filter is commonly used with a static window size (a 5×5 or 7×7 matrix, for example), which is not ideal in forest applications: if the window is too big, small trees are overlooked, and if it is too small, the same tree might be counted more than once. Or, as Popescu puts it, “if the filter size is too small or too large, errors of commission or, respectively, omission occur” (Popescu, et al, 2002). Thus, in situations where trees with small and large canopies are mixed together, employing a variable sized window function provides better results.
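The static-window behavior described above can be sketched in a few lines. This is a plain-Python illustration with a square window and an invented height grid, not the ERDAS Imagine function used in the project; the function names and the `min_height` cutoff are hypothetical.

```python
def local_max_filter(raster, half_width):
    """Give each cell the highest value found in its square neighborhood.

    half_width=1 is a 3x3 window, half_width=2 is a 5x5 window, and so on.
    Cells outside the raster edge are simply ignored.
    """
    rows, cols = len(raster), len(raster[0])
    output = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best = raster[r][c]
            for rr in range(max(0, r - half_width), min(rows, r + half_width + 1)):
                for cc in range(max(0, c - half_width), min(cols, c + half_width + 1)):
                    best = max(best, raster[rr][cc])
            output[r][c] = best
    return output


def treetops(raster, half_width, min_height=0):
    """List (row, col, height) for cells that equal their neighborhood maximum."""
    filtered = local_max_filter(raster, half_width)
    return [
        (r, c, raster[r][c])
        for r in range(len(raster))
        for c in range(len(raster[0]))
        if raster[r][c] > min_height and raster[r][c] == filtered[r][c]
    ]


# Hypothetical heights in feet: a 40 ft tree two cells away from a 30 ft tree.
heights = [
    [0, 0, 0, 0, 0],
    [0, 40, 0, 30, 0],
    [0, 0, 0, 0, 0],
]
print(treetops(heights, half_width=1))  # 3x3 window
print(treetops(heights, half_width=2))  # 5x5 window
```

With the 3x3 window both trees are reported, while the 5x5 window reports only the taller one, which is the error of omission that motivates a window that varies with height.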

            The next step in the analysis process was to determine what window sizes and shapes were appropriate. The following equation was used to relate tree height (H) and crown width for pine trees:


Crown width = 3.75105 - 0.17919H + 0.01241H²
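To give a feel for the numbers this relationship produces, the sketch below evaluates it for a few tree heights. The report does not state the units of the equation; the code assumes both H and crown width are in meters, and converts from the feet used for heights elsewhere in this report. The sample heights are arbitrary.

```python
METERS_PER_FOOT = 0.3048


def crown_width(height_m):
    """Crown width predicted from tree height H by the quadratic fit.

    Assumption: height and crown width are both in meters.
    """
    return 3.75105 - 0.17919 * height_m + 0.01241 * height_m ** 2


# Arbitrary example heights in feet, converted to meters before evaluation.
for height_ft in (30, 50, 70, 90):
    height_m = height_ft * METERS_PER_FOOT
    print(f"{height_ft} ft tall -> crown width about {crown_width(height_m):.1f} m")
```

Because the predicted width grows faster as trees get taller, the jump from one window size to the next happens over a narrower range of heights for tall trees than for short ones.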


This equation is based on the assumption that taller trees have larger diameter crowns, and was derived from field inventory (Popescu and Wynne, 2004). In addition, when dealing with pure pine stands, “searching for the LM to identify individual crowns with a circular window of variable diameter is more appropriate than filtering with a square window” (Popescu and Wynne, 2004). Using the equation above it was determined that the following window sizes should be employed:



In this study the variable window size local maximum function was performed using the modeler within ERDAS Imagine. The first step in this process was to input the raster containing the vegetation height into five neighborhood functions of varying size (see table, above), producing five new LM raster files. These five LM raster files, along with the original height raster, were then used as input for a conditional statement. This conditional statement read the value of each pixel in the height raster and then gave it the value from one of the five LM raster files, depending on the original height value. Next, the original height raster and the product of this last step were fed into another function, which compared the two cell by cell. If the output from the conditional statement was the same as the original height raster, then that cell qualified as a treetop and was given its height value; if not, the cell did not represent a treetop and was given a value of zero.
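The same logic can be sketched compactly in plain Python. Instead of building five LM rasters and then choosing among them, this version picks a radius for each cell from its own height and compares the cell with the maximum inside that circular window, which should give the same result. The function names, the two-step radius rule, and the sample grid are hypothetical and chosen only to keep the example small.

```python
def circular_offsets(radius):
    """Cell offsets whose centers lie within `radius` cells of the center."""
    reach = int(radius)
    return [
        (dr, dc)
        for dr in range(-reach, reach + 1)
        for dc in range(-reach, reach + 1)
        if dr * dr + dc * dc <= radius * radius
    ]


def variable_window_treetops(heights, radius_for_height, min_height=0.0):
    """Return a raster holding the height at treetop cells and 0 elsewhere."""
    rows, cols = len(heights), len(heights[0])
    tops = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            height = heights[r][c]
            if height <= min_height:
                continue
            # Window size depends on the height of the cell being tested.
            radius = radius_for_height(height)
            neighborhood_max = max(
                heights[r + dr][c + dc]
                for dr, dc in circular_offsets(radius)
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            )
            # A cell is a treetop when nothing in its window is higher.
            if height == neighborhood_max:
                tops[r][c] = height
    return tops


# Hypothetical rule: radius 2 cells for heights of 50 ft or more, else 1 cell.
def example_radius(height_ft):
    return 2 if height_ft >= 50 else 1


# Hypothetical heights in feet: an 80 ft tree with a 70 ft shoulder and a
# 30 ft tree with a 25 ft shoulder.
grid = [[0.0] * 7 for _ in range(5)]
grid[2][1], grid[1][1] = 80.0, 70.0
grid[2][4], grid[2][5] = 30.0, 25.0

tops = variable_window_treetops(grid, example_radius)
found = [(r, c, h) for r, row in enumerate(tops) for c, h in enumerate(row) if h > 0]
print(found)
```

Only the two crown apexes are reported; the shoulder cells are rejected because a higher cell falls inside their windows.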

The resulting raster gave the location and height of the treetops within the study area. The final step was to use the shapefiles of the four specific study areas (made in ArcMap), convert them to areas of interest (AOI) and then use Imagine to subset the treetops raster file.



Text of Conditional Statement:


(tree_height <= 43) treeheight_matrix_output_radius_c1,

(tree_height > 43 and tree_height <= 59) treeheight_matrix_output_radius_c15,

(tree_height > 59 and tree_height <= 70) treeheight_matrix_output_radius_c2,

(tree_height > 70 and tree_height <= 79) treeheight_matrix_output_radius_c25,

(tree_height > 79) treeheight_matrix_output_radius_c3,

(tree_height) 200}
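For readers unfamiliar with the modeler syntax, the statement above can be read as the following Python function. This is an interpretation rather than a translation tool: the dictionary keys reuse the raster name suffixes (c1, c15, c2, c25, c3), the breakpoints are in feet as in the statement, and the final clause is treated here as a fallback value of 200 that ordinary numeric heights never reach.

```python
def select_lm_value(tree_height, lm_values):
    """Pick the local maximum value from the window matched to the height.

    lm_values maps the suffixes c1, c15, c2, c25 and c3 to the value of the
    corresponding LM raster at the cell being evaluated.
    """
    if tree_height <= 43:
        return lm_values["c1"]
    if 43 < tree_height <= 59:
        return lm_values["c15"]
    if 59 < tree_height <= 70:
        return lm_values["c2"]
    if 70 < tree_height <= 79:
        return lm_values["c25"]
    if tree_height > 79:
        return lm_values["c3"]
    # Fallback clause from the statement; not reached for ordinary numbers.
    return 200


def treetop_value(tree_height, lm_values):
    """Keep the height where it equals the selected LM value, else return 0."""
    return tree_height if select_lm_value(tree_height, lm_values) == tree_height else 0


# Hypothetical LM values at one cell whose own height is 65 ft.
cell_lm = {"c1": 65, "c15": 65, "c2": 65, "c25": 72, "c3": 80}
print(treetop_value(65, cell_lm))
```

In this example the 65 ft cell is tested against the c2 window, where it is the highest value, so it is kept as a treetop even though taller cells exist farther away.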


Map of Specific Study Areas




The following four charts are the histograms for the analysis results. On the x-axis is the height of the LIDAR returns and on the y-axis is the number of returns for the different heights. There is one graph for each of the four specific study sites and all of the returns have been included. The spikes in the graphs represent a high density of returns from vegetation with similar height.










 The above charts represent the density of the trees in the different specific study areas. The trees have been binned into “small trees,” those trees between 10 and 30 feet, “medium trees,” those trees between 30 and 50 feet, and “tall trees,” those trees taller than 50 feet.
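The binning described above can be sketched as follows. The text does not say which class a tree of exactly 30 or 50 feet belongs to, so the boundary handling here is an assumption, and the sample heights are invented.

```python
from collections import Counter


def height_class(height_ft):
    """Assign a treetop height in feet to a size class, or None below 10 ft.

    Assumption: 30 ft and 50 ft both fall in the medium class.
    """
    if height_ft < 10:
        return None
    if height_ft < 30:
        return "small"
    if height_ft <= 50:
        return "medium"
    return "tall"


def class_counts(treetop_heights_ft):
    """Count treetops per size class, skipping vegetation under 10 ft."""
    classes = (height_class(h) for h in treetop_heights_ft)
    return Counter(c for c in classes if c is not None)


# Hypothetical treetop heights in feet.
sample_heights = [8.0, 12.5, 29.9, 30.0, 44.0, 50.0, 61.2]
counts = class_counts(sample_heights)
print(dict(counts))
```

Dividing each count by the area of the study site would turn these counts into the per-class densities shown in the charts.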


Standard deviation statistics were also calculated for the heights of the returns around the tall trees in the study areas. The method used for this was to employ an annulus shaped matrix to find the standard deviation of the area outside of a three meter radius but within a five meter radius. Tall trees were those with return heights over 55 feet, except in the case of the pocosin “low” area, where 30 feet was used due to the lack of trees over 55 feet. The results are as follows:
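A minimal sketch of the annulus calculation is given below. It assumes one meter cells, so that radii in meters equal radii in cells, and it uses the population standard deviation; the report does not say which form of the statistic was used. The rasters are invented for the example.

```python
from statistics import pstdev


def annulus_offsets(inner_radius, outer_radius):
    """Cell offsets farther than inner_radius but within outer_radius."""
    reach = int(outer_radius)
    inner_sq = inner_radius ** 2
    outer_sq = outer_radius ** 2
    return [
        (dr, dc)
        for dr in range(-reach, reach + 1)
        for dc in range(-reach, reach + 1)
        if inner_sq < dr * dr + dc * dc <= outer_sq
    ]


def annulus_std(heights, row, col, inner_radius=3, outer_radius=5):
    """Standard deviation of heights in a ring around the cell (row, col)."""
    rows, cols = len(heights), len(heights[0])
    values = [
        heights[row + dr][col + dc]
        for dr, dc in annulus_offsets(inner_radius, outer_radius)
        if 0 <= row + dr < rows and 0 <= col + dc < cols
    ]
    # Assumption: population standard deviation of the ring cells.
    return pstdev(values) if values else 0.0


# Hypothetical 11 x 11 rasters (feet) with a tall tree at the center cell.
flat = [[20.0] * 11 for _ in range(11)]
flat[5][5] = 60.0
split = [[10.0 if c < 5 else 30.0 for c in range(11)] for _ in range(11)]
split[5][5] = 60.0

print(annulus_std(flat, 5, 5))   # uniform surroundings
print(annulus_std(split, 5, 5))  # mixed surroundings
```

The tall tree itself sits inside the inner radius and so does not influence the statistic; only the variability of its surroundings is measured.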



Accuracy Assessment


The accuracy assessment for this project involved comparing the density of trees within the specific study areas to reference data. The standard of trees per acre (TPA) was used, with a tree being defined as vegetation with a height of 10 feet or more.   Two sources of reference data were used: the TPA attribute listed in the forest stand inventory data and a visual inspection of random areas within the specific study sites.

The stand data listed the TPA for each of the stands within the subset of Croatan National Forest. This information was compared to the number of trees per acre (TPA) calculated in the analysis. Two of the specific study sites overlapped more than one of these stands; in each case, however, the TPA value was the same for each of the intersected stands, so no estimation was needed.

The second source of reference data was a density measurement created from the visual inspection of panchromatic orthophotographs with two foot spatial resolution. The number of trees inside randomly placed 100 meter diameter circular areas was used to create measurements of tree density. Three circles were used for each of the specific study areas, except for the smallest one, for which the entire area was used. In the creation of this reference data “trees” were selected based on the appearance of a discernible crown in the image.
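Converting a count inside one of these circles into trees per acre is simple arithmetic, sketched here. The tree count is a made-up number used only to show the calculation, and the international acre (4046.8564224 square meters) is assumed.

```python
import math

SQUARE_METERS_PER_ACRE = 4046.8564224


def circle_area_acres(diameter_m):
    """Area in acres of a circular plot with the given diameter in meters."""
    radius_m = diameter_m / 2.0
    return math.pi * radius_m ** 2 / SQUARE_METERS_PER_ACRE


def trees_per_acre(tree_count, area_acres):
    """Tree density (TPA) for a plot of known area."""
    return tree_count / area_acres


plot_acres = circle_area_acres(100.0)
hypothetical_count = 150  # invented count of visible crowns in one circle
print(f"plot area: {plot_acres:.2f} acres")
print(f"density: {trees_per_acre(hypothetical_count, plot_acres):.0f} TPA")
```

Each 100 meter circle covers a little under two acres, so a miscount of a few crowns changes the resulting TPA by only a small amount.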

There are several weaknesses in the accuracy assessment that need to be addressed. First, little documentation was available for the forest stand inventory data; therefore the specifics about how it was collected and its accuracy are undetermined. For the purposes of this study, vegetation over 10 feet tall was considered a tree; it isn’t clear what definition was used when researchers or Croatan National Forest staff collected the reference data. Despite this, it was deemed that this data was appropriate for use as a comparison in this project. An additional problem is that the orthophotographs used for visual inspection were panchromatic, making the accurate selection of trees more difficult and prone to error. Finally, the density of the trees was the only measurement made. Additional types of reference data were unavailable; however, having information about the precise location and height of a reference sample of trees would lead to a more robust measurement of accuracy.

Because of the nature of this study a traditional error matrix was not created; the results of the accuracy assessment are listed in the table below.








The results of this research demonstrate the ability to analyze forest structure using LIDAR derived data and a variable size window local maximum function. The forest structure can be displayed in graphical form, answering questions about the trends in the heights of the vegetation. The height data can also be displayed on maps, helpfully overlaid on a base map, which is very useful for fast, precise knowledge of the heights of trees.

The histogram charts shown above have interesting patterns that should be discussed. In the longleaf “open” stand the tree density is low and the heights of the trees are quite evenly distributed throughout the range of heights. In the longleaf “closed” stand, there is a higher density of trees in each of the three height classes when compared to the longleaf “open” stand. The pocosin “low” stand is dominated by small trees and no tall trees were found. In the pocosin “high” stand, the majority of trees occur within the medium size class. The pattern in the pocosin “high” stand, with the gradual increase in frequency toward a peak near 40 feet and another at 55 feet and an abrupt transition, may be a result of tall evergreen shrubs, such as loblolly bay (Gordonia lasianthus), being counted along with the pond pine.

The results of the standard deviation test show that the areas around tall trees were similarly variable, with values around ten. The exception was the pocosin “low” study area, where the variability was slightly lower at around seven. This type of measurement demonstrates the ability to gather useful statistics about traits of the forest structure other than density.

While LIDAR data and this method have proven useful, this research also shows considerable room for improvement. Compared to the stand data, the method demonstrated in this project greatly undercounted the number of trees. While there are unanswered questions about the reliability of the reference data, reasons for an undercount of trees should be discussed. One possibility is that the formula for the relationship between tree height and crown width is flawed. Some of the stands in the study areas had very tall trees that were quite close together. Because of this density, the trees don’t have enough room to spread out to the width predicted by the formula. The method used has no way of taking this into account and would use a large radius matrix to look for treetops (because of the great height) but would then miss the other tall trees very close by. This could be solved by changing the formula or using different formulas for areas with different densities, determined either by inspection or possibly by an embedded preliminary technique for calculating density. Additionally, the undercount of longleaf was much less than that of pocosin. This would indicate that the method might need to be altered based on the type of trees being analyzed, even between two similar conifer species.

The comparison between the results of the analysis and the reference data derived from visual inspection, on the other hand, is not as clear cut. It implies undercounts in some areas but for the most part shows evidence of overcounting the number of trees. Again, this could be due to flawed reference data, in this case two foot resolution panchromatic orthophotographs. In fact, in some instances it was difficult for the analyst to distinguish between the high reflectance of the bare sandy soil and the high reflectance of a cluster of tree canopies. Assuming, however, that the overcount is a genuine problem, possible reasons should be discussed.

            One possible place to look for solutions to this problem is the cell size of the LIDAR derived data. In the course of preparing the data for analysis it was re-sampled from its original one foot resolution into one meter resolution. This greatly reduced processing requirements and introduced metric units. Re-sampling instead to half meter or even one-third meter resolution could possibly lead to more accurate and precise results. These smaller cell sizes would be useful when determining what matrices to use both because the size of the jumps between consecutive filter sizes would be smaller and also because the matrices could more closely approximate circles.
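The effect of cell size on how well a matrix can approximate a circle can be illustrated with a short sketch. The 1.5 meter radius is an arbitrary example value and the function name is hypothetical.

```python
def circular_kernel(radius_m, cell_size_m):
    """Build a 0/1 matrix marking cells whose centers lie within the radius."""
    radius_cells = radius_m / cell_size_m
    reach = int(radius_cells)
    return [
        [1 if dr * dr + dc * dc <= radius_cells * radius_cells else 0
         for dc in range(-reach, reach + 1)]
        for dr in range(-reach, reach + 1)
    ]


def show(kernel):
    """Print a kernel using # for cells inside the window."""
    for row in kernel:
        print("".join("#" if cell else "." for cell in row))


# The same 1.5 m radius window at two cell sizes.
coarse = circular_kernel(1.5, 1.0)  # one meter cells
fine = circular_kernel(1.5, 0.5)    # half meter cells

show(coarse)
print()
show(fine)
```

At one meter cells the 1.5 meter window is a full 3x3 square, whereas at half meter cells the corners are trimmed and the window begins to look like a circle.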

A last possible source of miscounted trees is the process of converting the LAS file to a TIN. The TIN connects all the points into a continuous surface of triangles. The problem is that the ridges created in the TIN often leave artifacts in the data that influence the later analysis. An example of a situation where this could be a problem is two returns near each other representing two trees. When the TIN is created it connects these two returns with a line forming the knife edge of a ridge. After the analysis is complete the two original points are flagged as treetops; however, several points along the ridge are also flagged as treetops even though in actuality there is nothing between the two trees. This situation could be rectified by using a different method to create the TIN, or perhaps even by eliminating the TIN altogether and finding a more efficient method for converting the ASCII text into a raster file.



This project was designed to test the feasibility of using LIDAR data to characterize the structure of forested woodlands. The methods used in this project showed promise but need to be refined in order to produce more reliable results in the unique systems studied here. This primarily means customizing the local maximum filter method for characterizing the forested stands, especially the Pocosin stands.  


Recommendations and Applications beyond study area

The techniques used in this project could be extended to other vegetation types within the Croatan National Forest, for example the planted loblolly pine stands. Additionally, the forest types in this forest are common to the whole of coastal North Carolina and thus the techniques developed for use in this study area could be directly applied elsewhere. The ability to fully characterize both Pocosin and longleaf pine stands with respect to understory characteristics is an ultimate goal.


Future Studies

The structure of forest stands directly affects their suitability to support certain species. After further refinement, techniques developed could potentially be implemented for use in the Southeast Gap Analysis Program (SEGAP) to help improve habitat modeling for animals of the region.



Szaro, R. and R. Balda, 1979. Bird Community Dynamics in a Ponderosa Pine Forest, Cooper Ornithological Society, Lawrence, 73 p.


Wang, L., P. Gong, and G. Biging, 2004. Individual tree-crown delineation and treetop detection in high-spatial-resolution aerial imagery, ISPRS Photogrammetric Engineering & Remote Sensing, 70:351-357.


Koch, B., U. Heyder, and H. Weinacker, 2006. Detection of individual tree crowns in airborne lidar data, ISPRS Photogrammetric Engineering & Remote Sensing, 72:357-363.


Smith, Brandon R. Floodplain Fliers, North Carolina’s Massive LIDAR Project. 1 Feb. 2002. 18 Nov. 2006 <http://www.geospatial-online.com/geospatialsolutions/article/articleDetail.jsp?id=8097>.


Graham, Lewis. “The LAS 1.1 Standard.” ASPRS LIDAR Committee (2005)


Kologiski, Russel L. The Phytosociology of the Green Swamp, North Carolina. Tech. Bul. No. 250: North Carolina Agricultural Experiment Station, 1977.


Popescu, Sorin C., and Randolph H. Wynne. “Seeing the Trees in the Forest: Using Lidar and Multispectral Data Fusion with Local Filtering and Variable Window Size for Estimating Tree Height.” Photogrammetric Engineering & Remote Sensing 70.5 (2004): 589-604.


Popescu, Sorin C., Randolph H. Wynne, and Ross F. Nelson. “Estimating Plot-level Tree Heights with Lidar: Local Filtering with a Canopy-height Based Variable Window Size.” Computers and Electronics in Agriculture 37 (2002): 71-95.
