Implementation of a double scattering nozzle for Monte Carlo recalculation of proton plans with variable relative biological effectiveness

A constant relative biological effectiveness (RBE) of 1.1 is currently used in clinical proton therapy. However, the RBE varies with factors such as dose level, linear energy transfer (LET) and tissue type. Multiple RBE models have been developed to account for this biological variation. To enable recalculation of patients treated with double scattering (DS) proton therapy, including LET and variable RBE, we implemented and commissioned a Monte Carlo (MC) model of a DS treatment nozzle. The main components from the IBA nozzle were implemented in the FLUKA MC code. We calibrated and verified the following entities to experimental measurements: range of pristine Bragg peaks (PBPs) and spread-out Bragg peaks (SOBPs), energy spread, lateral profiles, compensator range degradation, and absolute dose. We recalculated two patients with different field setups, comparing FLUKA vs. treatment planning system (TPS) dose, also obtaining LET and variable RBE doses. We achieved good agreement between FLUKA and measurements. The range differences between FLUKA and measurements were for the PBPs within ±0.9 mm (83% ⩽ 0.5 mm), and for SOBPs ±1.6 mm (82% ⩽ 0.5 mm). The differences in modulation widths were below 5 mm (79% ⩽ 2 mm). The differences in the distal dose fall off (D80%–D20%) were below 0.5 mm for all PBPs and the lateral penumbras diverged from measurements by less than 1 mm. The mean dose difference (RBE = 1.1) in the target between the TPS and FLUKA were below 0.4% in a three-field plan and below 1.4% in a four-field plan. A dose increase of 9.9% and 7.2% occurred when using variable RBE for the two patients, respectively. We presented a method to recalculate DS proton plans in the FLUKA MC code. The implementation was used to obtain LET and variable RBE dose and can be used for investigating variable RBE for previously treated patients.


Introduction
The majority of long-term follow-up data from proton therapy are from patients treated with double scattering (DS) proton therapy. This delivery technique employs scattering and range shifting components to spread and confirm the beam. One of the benefits of proton therapy vs. conventional radiotherapy with photons is the reduction of dose to healthy tissue (Lomax et al 1999, MacDonald et al 2008. To relate clinical photon data to protons, current clinical practice considers protons to be uniformly 10% more effective than photons, characterized by a relative biological effectiveness (RBE) of 1.1. However, in vitro experiments have shown that the RBE of protons vary with the linear energy transfer (LET), dose level, tissue fractionation sensitivity (α/β ratio) and clinical endpoint (Paganetti 2014). A particular increase in RBE towards the end of the proton beam range due to elevated LET has also been suggested by in vivo experiments (Saager et al 2018). This increase could be problematic for patients with organs at risk (OARs) located near the distal end of treatment fields. Moreover, recent studies have indicated a possible correlation between the increased LET and asymptomatic magnetic resonance (MR) image changes (Bahn et al 2020, Eulitz et al 2019b, Peeler et al 2016, Underwood et al 2018. Still, further investigations are required to understand how and if elevation in LET/RBE influence the toxicity in OARs.
Multiple models have been developed to account for the variable RBE, both phenomenological models (Rørvik et al 2018), and biophysical models such as the local effect model (LEM) (Scholz et al 1997, Elsässer et al 2010 and the microdosimetric-kinetic model (MKM) (Hawkins 1998). To enable correlation studies between the RBE variation and follow-up data, the LET distributions must be obtainable in combination with the physical dose for previously treated patients. Calculation of variable RBE dose in commercial treatment planning systems (TPSs) involved in clinical practice is under development. This advancement will increase the need of Monte Carlo (MC) codes for comparison and verification purposes. Furthermore, MC codes have been shown to be more accurate, especially in heterogeneous regions (Paganetti 2012. To recalculate DS proton therapy plans in an MC code, the geometry within the particular treatment nozzle must be implemented and undergo verification and commissioning, ideally to experimental measurements acquired at the specific beam delivery unit. DS proton therapy nozzles have previously been implemented in MC codes such as TOPAS (Perl et al 2012, Testa et al 2013, Shin et al 2017, Liu et al 2019, Geant4 (Paganetti et al 2004(Paganetti et al , 2008 and MCNPX (Sayah et al 2013). The FLUKA MC code (Battistoni et al 2016, Ferrari et al 2005, Böhlen et al 2014 have earlier been applied in recalculation of pencil beam scanning proton therapy plans (Parodi et al 2012, Giovannini et al 2016, Fjaera et al 2017 and carbon ion treatment plans , Bauer et al 2014, where the necessary nozzle geometry to implement is substantially reduced compared with DS nozzles.
The purpose of this study was to develop a system that enables MC recalculation of patients treated with DS proton therapy, specifically aiming towards LET and variable RBE calculation with phenomenological models. The DS nozzle geometry was implemented into the MC code and was commissioned and verified to experimental measurements.

Materials and methods
2.1. The double scattering nozzle DS proton therapy starts with an initial pencil beam that traverses various nozzle components in order to spread and confirm the beam both in the lateral and longitudinal directions to cover the entire target volume with dose. At the University of Florida Health Proton Therapy Institute (UFHPTI), a total of three DS nozzles from IBA (Ion Beam Applications, Belgium) are installed. The main components of each nozzle are (in sequence from upstream towards the patient isocenter): first ionization chamber, first scatterers (FS), range modulator (RM) wheels, second scatterers, variable collimators, second ionization chamber, field mirror, snouts and lastly patient specific apertures and range compensators. The initial beam is accelerated by a cyclotron and is steered into the nozzle by powerful magnets. The beam energy directly prior to entering the nozzle is known, but the beam size, shape, divergence, and energy spread at the nozzle entrance is typically derived from models. Typical values for the initial beam at the UFHPTI are an angular divergence of 3 mrad and a Gaussian beam spot size in x and y of 1.41 cm at FWHM. The parameter which affects the beam the most is the energy spread, which we determined during the nozzle commissioning in the MC software.

Geometry implementation
The geometry of the most central components of the IBA nozzle at the UFHPTI was implemented into the FLUKA MC code based on vendor blueprints. FLUKA has a powerful graphical user interface called Flair (Vlachoudis 2009) which simplifies geometry definition and provides the ability to visualize the geometry during creation. Most of the components mentioned above were relatively straightforward to implement, using basic structures such as rectangular parallelepipeds (RPPs), cylinders, cones, and planes. However, more complex components required advanced solutions for implementation. The geometrical implementation of these components is described in detail below.

Range modulator wheels
The IBA nozzle contains three different RM wheels, where each wheel has three separate tracks (figure 1) defined to specify spread-out Bragg peaks (SOBPs) of various ranges and modulation widths. An IBA RM wheel track has steps with varying thickness, where each step defines a single pristine Bragg peak (PBP). The number of steps on a wheel track range between 14 and 31, depending on the track. When the wheel rotates during beam irradiation, the beam sequentially penetrates the different steps which ultimately results in an SOBP. Each step is made of either Lexan (polycarbonate) or carbon, and these low-Z materials are combined with steps of lead (high atomic number) in order to provide a constant energy loss while maintaining the Figure 1. Illustration of the upstream part of a range modulator wheel consisting of three tracks. The golden sectors are the brass stop blocks, while each blue sector is a step made of Lexan where darker blue illustrates an increase in step thickness. The beam is not necessarily turned on (i.e. the start angle) at the zero angle (red dashed line). The correct start angle must be identified (green dashed line), in order to have the beam stop at the correct position on the wheel (stop angle at green dashed line). Note also that the beam has a certain size and may therefore stop while overlapping multiple consecutive steps. scattering power. Every step has different water equivalent thicknesses and angular widths, defining the PBP pull-back and weight, respectively. A brass stop block is also used at the boundary between the thickest and thinnest step to ensure that the beam does not penetrate these adjacent steps simultaneously.
During treatment, the beam will be turned on at the start position (i.e. start angle) of the wheel and begin rotating. The requested modulation width of the SOBP decides the so-called stop angle, which is the position on the wheel where the beam is turned off. The wheel subsequently rotates back to the start position, followed by the beam being turned on again, where a complete rotation lasts 100 ms. While this results in an SOBP, it is not necessarily flat. Thus, in order to maintain a flat SOBP for different initial beam energies and a limited number of RM wheels, beam current modulation (BCM) is applied at the cyclotron to regulate the beam intensity (Lu and Kooy 2006). There are 24 BCM files associated with each treatment nozzle. The BCM file contains 255 weights that specify the beam intensity, used to fine-tune the flatness of the SOBPs. Each BCM weight is applied at a specific angular position on the wheel track, covering an angle span of 360 • /255 = 1.41 • .
In FLUKA, we implemented each RM track as a cylinder, where the steps were separated using infinite planes. In order to create an SOBP, the wheel has to rotate in real-time, or the rotation must be modelled by using probability as described below. To model a rotating wheel and to apply the BCM we implemented the following algorithm in the FLUKA source user routine: For each primary particle (PP), a weighted random sampling method is used, where the probability to sample a specific BCM weight is decided by its respective weight. The angle span of the sampled BCM weight is read out, and subsequently, if the angle span is either fully or partly within the start and stop angles of the wheel, a random angle within the BCM angle span is sampled. If this sampled angle is still within the start and stop angles, the wheel is rotated by number of degrees equal to the sampled angle and the PP is simulated. If the angle span, or sampled angle, is not within the start and stop angles, an entirely new BCM weight is sampled, and the process is repeated. After a sufficient number of angles are sampled, and subsequently number of PPs simulated, this method will perform as a real-time rotating wheel. This solution also considers that the beam spot can be positioned between multiple consecutive steps.

Apertures
The treatment field specific aperture is used to conform the dose in the lateral direction. The shape of the aperture opening (defined by coordinates located in the DICOM RT Plan file) is dependent on the target volume and is typically shaped as an arbitrary polygon. FLUKA does not natively support defining polygons in the geometry; therefore, we had to approximate the opening by other means. To define the aperture opening in FLUKA, we used RPPs with a base area of 1 × 1 mm 2 . Since a smaller base area would lead to an increase in CPU time per patient, the base area was chosen as a compromise between maintaining particle tracking efficiency and keeping a low uncertainty in the definition of the aperture opening. To increase the efficiency further, the RPPs were restricted to the boundary. The center of the opening was defined using a cylinder with a radius slightly smaller than the closest distance between center and aperture opening boundary. Thus, we could approximate the opening using roughly between 500 and 3300 RPPs, highly dependent on the shape and size (figure 2(a)).
The coordinates of the aperture opening, (x,y) iso , are defined in the isocenter plane. As the aperture itself is located upstream of the isocenter, the coordinates must be projected onto the correct plane. Using the virtual source axis distance (VSAD) and the isocenter to block tray distance (IBTD) (both located in the DICOM RT Plan file), we defined the projected coordinates by the following equation: (1)

Range compensators
The range compensator confirms the dose to the distal end of the target. It is typically milled from a block made of Lucite (polymethyl methacrylate). The DICOM RT plan file contains information on the drill bit diameter, the position of each hole as well as the depth of the drill hole.
To create the range compensator in FLUKA, we defined the outer part as a cylinder made of Lucite. The drill holes were created by defining partially overlapping cylinders with the same diameter as the drill bit, while the heights of the cylinders were defined equal to the drill depths in the DICOM file (figures 2(b) and (c)).
As with the aperture, the drill positions are defined in the isocenter plane following the DICOM standard. Therefore, the same coordinate transformation as for the aperture had to be conducted, however, exchanging the IBTD by the isocenter to compensator distance also obtained from the DICOM file.

Nozzle commissioning
The IBA universal nozzle is capable of creating treatment fields with ranges between 4.6 and 28.4 cm in water by using beam energies between 155 and 230 MeV. The range interval is divided into eight smaller spans denoted as options, where each option is labeled B1 to B8. Each option consists of a specific combination of second scatterer and RM wheel track working in conjunction with the FS to create a flat beam profile with a specified range and modulation width. Each option is further divided into suboptions where the only parameters changing within the suboptions are the initial beam energy (i.e. range), modulation width and FS setting. Furthermore, each suboption is applied a specific BCM. Verification and calibration should therefore be done for representative beams for each suboption.
All simulations were conducted in FLUKA version 2011.2x using the HADROTHErapy default settings, with particle transport and delta ray production thresholds set to 100 keV (neutrons: 10 −5 eV). Considering that all measurements had been conducted in water phantoms and that the TPS calculates the dose to water, the FLUKA doses were either directly obtained in water or converted to dose to water for other materials. The depth dose curves for the PBPs and the SOBPs were measured using an IBA PPC05 parallel plate ionization chamber with a collecting diameter of 1 cm placed in an IBA Blue Phantom water phantom. The gantry angle was set to 0 • , to avoid irradiation through the phantom wall. The PBPs were created with the RM wheel in a fixed position, irradiating through the thinnest step. The lateral dose profiles were measured in an IBA Blue Phantom using an IBA EFD3G electron diode with an active area diameter of 2 mm and a thickness of 0.06 mm. Relevant details regarding the measurement setup such as air gaps and aperture openings are specified for each calibrated entity. Such details from the measurement setup were set equal in the simulation setup. The following sections describe the parameters we calibrated and/or verified to experimental measurements.

Range calibration
The initial parameter we calibrated was the range of the PBPs. During the range calibration, the energy spread had not yet been determined. Therefore, we used a fixed energy spread of 0.9 MeV as an approximate value. As a result, we chose to calibrate the range for the PBPs at the 80% distal dose (DD) level, since the range at the 80% DD is independent of the energy spread (Gottschalk 2011).
For every option, we selected three PBPs with ranges covered by each suboption. We adjusted the nozzle settings in FLUKA to correspond to the same setup used for the respective experimental measurements in water phantoms. The simulated ranges of the PBPs are prone to uncertainties depending on the applied normalization. To minimize these uncertainties, we normalized the PBPs, for both FLUKA and measurements, using a theoretical representation of the PBPs called the Bortfeld fit (Bortfeld 1997).
After the simulations, the range difference between FLUKA and measurements at the 80% DD were obtained. In order to adjust the FLUKA ranges in accordance with measurements, we tuned the initial beam energy. To find the new FLUKA energy, we rearranged the Bortfeld energy-range relation (Bortfeld and Schlegel 1996) to the following equation: where E N and E O are the beam energies applied in FLUKA post-and pre-adjustment, respectively. ∆R is the range difference between FLUKA and measurements at the 80% DD while α = 0.0022 cm MeV −1 and p = 1.77 are coefficients. Furthermore, we verified the calibration by simulating all the PBPs with the new energies and ensured that we achieved a range agreement to measurements within 0.5 mm.
To apply a systematic calibration for patient recalculations (with vastly different combinations of nozzle setups), we decided to use a fixed range calibration that was option dependent. Within an option, the nozzle geometry is permanent except for the different FS settings. Since the FS setting can change within a suboption, using a suboption-dependent range calibration does not induce increased accuracy. Thus, to calibrate the ranges in FLUKA, we calculated the mean energy adjustment for each option, hereafter referred to as the mean option energy adjustment, E A , using the following equation: where E O,s is the original energy for the PBP in suboption, s, E N,s is the new energy obtained from equation (2) for the same PBP, and 3 describes the total number of suboptions within an option, i.e. the number of simulated PBPs per option. We further verified the accuracy of the range calibration and aimed for an agreement between FLUKA and measurements within 1 mm. For all the PBPs, the air gap was set to 10 cm and we used an aperture opening of 15 × 15 cm 2 without a range compensator. We simulated 15 million PPs and the scoring region was defined as a cylinder with a diameter of 2 cm and a resolution of 0.5 mm in the longitudinal direction.

Energy spread calibration
The energy spread is dependent on the initial beam energy and determines the slope of the distal dose fall-off (DDF) for SOBPs and PBPs. The widths of the PBPs are also regulated by the energy spread. To calibrate the energy spread, we recalculated a total of 14 PBPs spanning energies between 155 and 226 MeV in intervals between 5 and 15 MeV.
To compare the FLUKA PBPs to measurements, we evaluated the millimeter differences at multiple dose levels and regions: • DD at the 90% to 50% dose levels in 10% intervals, • DD at the 20% dose level, • proximal dose (PD) at the 90% to 50% dose level in 10% intervals, • total width difference (distal-proximal) at the 90% to 50% dose level in 10% intervals, • DDF, i.e. the difference between the DD at the 80% and 20% dose levels.
The goal was to keep all differences between FLUKA and measurements below 0.5 mm, with the highest priority at the DD and DDF. The simulations and resulting comparisons were done on a trial-and-error basis. We started with an energy spread of 0.9 Mev. The simulated PBPs were compared with the measurements and re-simulated with either a higher or lower energy spread until the PBP met the goals. The identified energy spreads for the 14 PBPs were plotted against the initial beam energy in order to create an energy spread calibration curve to be used for subsequent simulations. The simulation setup was identical to the range calibration setup.

SOBP verification and start angle calibration
A specific combination of range and modulation width for an SOBP determines which option, thus which RM wheel track and second scatterer to be used. Each RM wheel track has a specific start angle (a position on the track where the beam is turned on) which is located somewhere on the brass stop block and typically differs from the zero angle of the wheel track (figure 1). This angle is not provided by the vendor and must be correctly identified. The stop angle, i.e. where the beam is turned off, determines the modulation width (defined here as the distance between the distal and proximal 90% dose). Furthermore, the stop angle is relative to the start angle and not the zero angle for the wheel track. Thus, an incorrect start angle may result in a modulation width that is either too long or too short. It could also result in a shift of the BCM weights, leading to incorrect weighting of the PBPs and subsequently either non-flat SOBPs and/or shifts in the measured oscillations in the plateau region.
To identify the start angles of the wheel tracks, we compared the modulation widths from simulations to corresponding measurements. We initially started with a start angle defined at the RM wheel zero angle and adjusted the start angle until the modulation width as well as the BCM corresponded to measurements. We simultaneously evaluated the range differences of the SOBPs. We aimed for range differences between FLUKA and measurements 1 mm and modulation width differences below 2 mm.
The start angles were obtained on trial-and-error basis; thus, 20 million PPs histories were initially simulated to reduce computational time. The energies and energy spreads used were obtained from the calibration, i.e. using the mean option energy adjustment as well as the energy spread curve. When the RM wheel start angle provided SOBPs that corresponded to the measurements, we verified the start angle by increasing the number of PPs to 100 million to ensure sufficient statistical accuracy. We used the same scoring definition as for the range calibration and the aperture opening was set to 15 × 15 cm 2 and the air gap to 10 cm. The SOBPs were normalized to the mean dose between the center of the SOBP plateau and ±25% of the modulation width.

Aperture and range compensator verification
We also verified the lateral dose profiles collimated by brass apertures. We simulated an SOBP with a range of 15.1 cm and a modulation width of 10.4 cm. The field was collimated using three different apertures with openings of 3 × 3 cm 2 , 6 × 6 cm 2 , and 10 × 10 cm 2 . We evaluated the differences between the measurements and the simulations at the 90% to 10% dose level in 10% intervals, in addition to the 80%-20% lateral penumbras.
Measurements of the lateral dose profiles had been taken at three different depths; 0.5 cm, 9.9 cm, and 14.1 cm. We scored the dose from the profiles in FLUKA at the same depths, with a longitudinal bin size of 1 mm. The bin size in the lateral direction was 0.5 mm and we simulated 200 million PPs. The air gap was set to 10 cm and the source surface distance (SSD) to 220 cm, identical to the measurement setup. The doses from the simulated and measured profiles were normalized to the mean dose between ±0.5 cm from the central axis.
In order to verify the range compensator implementation in FLUKA, we simulated an SOBP penetrating a stair-shaped range compensator and compared with measurements as well as the Eclipse TPS (Varian Medical Systems, Palo Alto, CA, USA). The range compensator consisted of Lucite with a density of 1.2 g cm −3 and three different thicknesses of 5.3 cm, 2.7 cm, and 0.1 cm. The SOBP had a range of 15.1 cm and a modulation width of 9.1 cm. We compared the dose in both the lateral and longitudinal directions at different depths and offsets from the central axis. The doses from the simulations, measurements and the TPS were normalized using the same methods as described for the apertures and SOBPs. Furthermore, we compared the 2D dose distributions from FLUKA and the TPS using the stair-shaped range compensator. The dose from the FLUKA simulation was normalized according to the absolute dose calibration described in the next section. The radius of the circular aperture opening was 6 cm. We simulated 300 million PPs and scored the dose in a 360 × 360 × 165 grid with voxel sizes of 0.5 × 0.5 × 1 mm 3 , the latter being in the longitudinal direction.

Absolute dose calibration
The output factor describes the relationship between the delivered dose and the treatment machine output. In proton therapy, the machine output is specified in monitor units (MU), where a MU corresponds to a certain amount of collected charge in the ionization chamber within the nozzle. At the UFHPTI, an analytical model is used to predict the output factor for patient treatment fields in order to reduce the number of calibration measurements (Kooy et al 2003(Kooy et al , 2005.
Monte Carlo simulations typically express the dose calculation in the form of relative dose, i.e. the dose per PP. However, it is usually beneficial to have the MC tool represent the dose in absolute units; thus, we implemented a method to obtain the relationship between the number of PPs and the delivered dose.
The ratio between the number of PPs and the delivered dose (PP ratio) is dependent on the nozzle setup, i.e. the treatment option, similar to the output factor (Kooy et al 2003). The PP ratio within a treatment option is also dependent on both the range and modulation width. By dividing the entire range span available for the treatment nozzle (4.6-28.4 cm) into small intervals, we can have the PP ratio as functions of only modulation width. Therefore, we obtained the relationship between the modulation width and the PP ratio for each suboption.
For every suboption we selected six SOBPs with varying modulation widths. Prior to the simulations, the water phantom in FLUKA was moved such that the center of the plateau of the SOBP was placed in the isocenter. The SOBP was simulated in FLUKA and we obtained the relative dose at the isocenter (expressed as Gy/PP). By taking the inverse of the relative dose, we obtained the PP ratio (PP/Gy) for the specific SOBP. This process was repeated for each of the six SOBPs within the suboption. We fitted curves describing the relationships between the modulation widths and PP/Gy for each suboption. By inputting the delivered dose from the treatment field information, we could thus calculate the number of PPs to normalize the FLUKA output in order to have the dose in absolute units. All SOBPs were simulated using 200 million PPs.

Patient recalculations
A final goal of the nozzle implementation is to obtain LET and variable RBE dose from proton therapy plans to further enable correlation studies of follow-up data from patients treated with DS proton therapy. Therefore, we wanted to compare the nozzle implementation and commissioning to the already commissioned Eclipse TPS at the UFHPTI.
We selected two anonymous pediatric posterior fossa tumor patients for recalculation. Both patients received total doses of 54 Gy(RBE) to the PTV using three fields for patient 1 and four fields for patient 2. In addition, patient 2 had experienced symptomatic brainstem injury. The patient CT images were converted to FLUKA voxel files by using a built-in function in Flair. The Hounsfield units (HUs) from the CT images were converted to material composition by the conventions of (Schneider et al 2000) and (Parodi et al 2007).
TPSs typically use the material density as well as the relative stopping power to characterize the radiation energy loss in different tissues for photons and protons, respectively. The material densities and stopping powers are related to the HUs acquired by the CT scanner. Thus, in order to have an unbiased comparison between two different dose calculation systems, the same calibration curves must be used. Therefore, the HU to material density curve was directly imported into FLUKA. To implement the HU to relative stopping power curve, we simulated PBPs in a total of 85 phantoms with different HUs ranging between −974 and 3071. The range at the 80% DD were read out and compared with the estimated ranges from the stopping power calibration curve. Similar to the range calibration, we used Bortfeld fits for normalization of the PBPs. We applied density correction factors to the HUs in the FLUKA voxel file such that the ranges would correspond to the stopping power curve and re-simulated the PBPs in order to verify the range adjustments. We scored doses in a 2 cm diameter cylinder using a 0.5 mm longitudinal resolution and simulated each PBP using 15 million PPs.
By convention, commercial TPSs calculate the dose to water (D w ) as opposed to MC codes which by default calculates the dose to material. Thus, in order to directly compare the two systems, the D w must also be calculated for the MC code. The D w in each voxel was obtained using: where Φ is the fluence for all particles in each voxel and LET d is the dose-averaged LET in water. The LET d calculation was implemented in our MC workflow in a previous study (Fjaera et al 2017). The LET d in water in each voxel was calculated using the following equation: where Φ i (E) is the fluence of particle type, i, with kinetic energy, E, LET i (E) is the unrestricted LET for the same kinetic particle, i, and ρ w is the density of water (1 g cm −3 ). We scored the RBE = 1.1 dose and LET d on the same grid as for the TPS and simulated the patients using a total of 300 million PPs per treatment field yielding a mean statistical uncertainty below 1% in the target region. The LET d was scored for primary and secondary protons. We also evaluated the agreement between FLUKA and the TPS using gamma tests.
Furthermore, our group has previously implemented and analyzed all published phenomenological RBE models (per November 2017) in our FLUKA MC system (Rørvik et al 2018). The RBE in each voxel can be expressed by: where D p is the physical dose from protons per treatment fraction, α x and β x are parameters in the linear-quadratic model for the reference photon radiation, and RBE max and RBE min describe the RBE at the lower and upper dose limits, respectively. By employing equation (6), we calculated the variable RBE-weighted dose in each voxel using the McNamara model (MCN) (McNamara et al 2015) including the RBE distribution. The RBE max and RBE min for the MCN model were calculated using: RBE min For the variable RBE calculations, an (α/β) x ratio of 2.1 was used (Meeks et al 2000). Figure 3 shows the completed IBA universal nozzle implemented in FLUKA. Using in-house developed scripts, the nozzle setup, as well as the patient position and scoring grid, can be automatically defined for each patient.

Range calibration
The mean option energy adjustments ranged between −0.92 MeV and 0.73 MeV (appendix table A1). Prior to tuning the initial beam energies in FLUKA, the difference in PBP ranges between FLUKA and measurements could reach up to 2 mm (figure 4). By adjusting the FLUKA beam energies using the mean option energy adjustment in equation (3), we achieved range differences for all options within ±0.9 mm, where the largest range differences occurred for the high range beams. For reference, the figure also shows the range difference when adjusting the energy specific to each PBP obtained from equation (2).

Energy spread calibration
The simulated PBPs were in good agreement with measurements after using the correctly identified energy spreads ( figure 5(a)). The relationship between the initial beam energy and energy spread was obtained from the 14 simulated PBPs (figure 5(b) and appendix table A2). Comparing to measurements, we achieved differences in DDF below 0.5 mm, and the differences between the 90% DD, 20% DD, and 90% PD were below 0.2 mm, 0.2 mm, and 0.5 mm for all PBPs, respectively (appendix table A3).  . The range differences between PBPs from FLUKA and measurements (FLUKA-measurement) after the mean option energy adjustment (green dots). The range differences before energy adjustments (blue triangles) and after PBP specific energy adjustment (red squares) are also shown for reference. The vertical lines divide the range spans covered by each nozzle option.

SOBP verification and start angle calibration
After the RM wheel start angles had been identified, the SOBPs showed good agreement between the measurements and FLUKA, particularly in the SOBP plateau regions and the distal parts (figure 6(a) and appendix table A4). The oscillations in the SOBP plateaus were also correctly modelled in FLUKA, confirming a good agreement between the RM wheel start angle and the BCM. Overall, the ranges for most of the SOBPs we verified were within ±1 mm of the measurements, with only 2 out of 28 SOBPs having range differences up to 1.6 mm ( figure 6(b)). The differences in modulation widths were below 2 mm in most cases. However, some SOBPs had larger differences up to 5 mm (figure 6(c)).

Aperture and range compensator verification
The lateral dose profiles from the aperture simulations showed good agreement with the measurements ( figure 7). For all evaluated aperture openings and depths, we achieved differences between FLUKA and measurements for the 80%-20% lateral penumbras below 1 mm (appendix table A5). FLUKA did, however, typically show slightly sharper penumbras compared with the measurements. Using the stair-shaped range compensator, the lateral dose profiles from FLUKA matched the measurements very well (figure 8). The dose oscillations resulting from multiple Coulomb scattering at the range shifter boundaries were also reproduced by FLUKA at depths of 3.7 cm, 6.7 cm and 9.7 cm (figures 8(a)-(c)), while the TPS predicted the dose less accurately in these areas. The pullback in the longitudinal dose profiles and the flatness of the SOBPs were in good agreement with measurements with range differences at the 90% DD below 1 mm (figures 8(e) and (f)).
The differences in the dose were minor when comparing 2D dose distributions between FLUKA and the TPS. However, there were some inconsistencies at the location of the dose oscillations/range shifter boundaries (figure 9), also visible in the lateral profiles (figure 8).

Absolute dose and patient recalculation
By simulating six SOBPs for each suboption within a nozzle option, we were able to predict the relationship between the number of primary particles and dose as functions of the modulation width (figure 10 and appendix table A6). The FLUKA recalculated RBE = 1.1 dose distribution for patient 1 was in very good agreement with the original TPS dose ( figure 11). There were no substantial range differences and minor lateral dose differences. The mean dose to the PTV was 54.7 Gy(RBE) in the TPS and 54.9 Gy(RBE) in FLUKA. The brainstem mean dose was also 0.2 Gy(RBE) higher in FLUKA, compared with a TPS dose of 49.5 Gy(RBE) (figure 12(a) and appendix table A7). The difference in the maximum dose (D 0.1cc ) between the two dose calculation systems was 0.3 Gy(RBE) for both the PTV and the brainstem. We obtained 3%/3 mm gamma pass rates of 100% in the PTV and 98.8% in the entire brain. The variable RBE-weighted mean dose from the MCN model was, as expected, higher compared with the RBE = 1.1 doses from FLUKA and the TPS dose, increasing to 60.1 Gy(RBE) in the PTV and 54.2 Gy(RBE) in the brainstem. The LET d increased at the distal part of the treatment fields leading to a high LET d edge (figure 11(e)), and a mean LET d in the brainstem of 2.3 keV µm −1 . This further resulted in visible increase in RBE from the MCN model in the same area with a mean RBE in the brainstem of 1.2 ( figure 12(b)).  For patient 2, there was a difference of 1.4% in the mean dose to the PTV. Using FLUKA we found a mean dose to the PTV of 54.7 Gy(RBE) and 35.5 Gy(RBE) to the brainstem while the TPS dose was 55.5 Gy(RBE) to the PTV and 35.4 Gy(RBE) to the brainstem (figure 12(a) and appendix table A7). The D 0.1cc was lower by 2.4% and 2.0% in FLUKA compared with the TPS in the PTV and brainstem, respectively. The gamma tests yielded 3%/3 mm pass rates of 92.2% in the PTV and 90.8% in the brain. The mean LET d in the brainstem was 4.4 keV µm −1 , resulting in a mean RBE of 1.4 from the MCN model ( figure 12(b)).

Discussion
In this paper, we presented the first implementation of an IBA DS proton therapy nozzle in the FLUKA MC code. Ranges and distal dose fall-offs of pristine Bragg peaks and SOBPs were calibrated with millimeter precision and lateral penumbras were verified with good accuracy. Implementation of patient-specific components was successfully verified, and we also obtained LET and RBE distributions in addition to  We evaluated the range for a total of 28 different SOBPs, where the range differences between FLUKA and measurements were below 1 mm for all except two SOBPs. Similar accuracy for DS in other MC commissioning studies have been reported for Geant4 (Paganetti et al 2004, Kim et al 2012, TOPAS (Testa et al 2013, Shin et al 2017, Liu et al 2019 and MCNPX (Sayah et al 2013). The modulation width differences were below 5 mm for all cases, and below 2 mm for 79% of the cases. The largest modulation width differences occurred mainly for SOBPs with long modulation widths, leading to a relatively flat shoulder in the proximal region. This can result in inaccuracies in determining the correct position of the 90% PD for simulations; due to a higher statistical uncertainty in determining the position (Testa et al 2013), and for measurements; since a slight difference in dose normalization or measurement setup may lead to large differences in the 90% PD (Lu and Flanz 2011). This could partly explain the increase we observed for the modulation width differences. We chose to evaluate the modulation widths using the same definition as applied at the UFHPTI, i.e. the distance between the 90% DD and 90% PD. However, an alternative approach is to define the modulation width between the 90% DD and the 98% PD where the gradient is steeper (Engelsman et al 2009).
By using a mean value for the energy adjustment for each option when calibrating the range for the Bragg peaks, we achieved overall range differences between FLUKA and measurements below 0.9 mm for the PBPs and below 1.6 mm for the SOBPs. To calibrate the ranges, we could have also adjusted parameters such as ionization potentials and material densities. However, these are parameters with potentially high uncertainties. For example, the densities used for the materials in the geometry implementation were provided by the vendor, where the exact material densities used in a nozzle may often not be known to a sufficient accuracy (Paganetti 2012). Even a slight discrepancy may introduce significant range differences (Bednarz et al 2011). We also maintained the standard values for ionization potentials defined in FLUKA, originating from the ICRU reports. For example, the ionization potential for water was set to 75 eV; however, there have been reports of ionization potentials for water between 75 and 81.8 eV, leading to potential range uncertainties (Kumazaki et al 2007). Therefore, the adjustments were restricted to beam energies only, in order to avoid extended error propagation.
To obtain the FLUKA dose in absolute units, we obtained the direct relationship between the number of PPs and the delivered dose (Clasie et al 2011). Another commonly used method is to relate the MC dose to the MU machine output to obtain the MC specific output factors. This is done by simulating the charge collected by the ionization chamber in the nozzle. However, the measurement volume in the ionization chamber is typically small and is filled with air which has low density. This therefore requires a substantial amount of PPs to be simulated to obtain a chamber response with sufficient statistics (Paganetti 2006). The output factor dependency in passive scattering proton therapy can be expressed as functions of a single factor, r = (R-M)/M, where R and M is the range and modulation width, respectively. In our case, we defined the absolute dose calibration curves as functions of modulation width and defined separate curves for small range spans, as we observed a more well-defined relationship between the number of PP and delivered dose in this case. Furthermore, some curves were very similar for different range spans within an option (e.g. option B4 in figure 10). However, we chose to keep separate curves within each option, as they were very sensitive to small changes, whereas using combined curves lead to increased dose differences compared with the TPS dose.
To achieve a mean statistical uncertainty below 1% in the target region, we simulated each patient treatment field using a total of 300 million PPs. This resulted in an average of 800 CPU-hours for each patient on our fastest CPU, an Intel Xeon Gold 6248, or 10 h distributed among the 80 available cores. In DS proton therapy, the majority of the simulation time is used to track the particles within the nozzle. Furthermore, for an aperture with a small opening of 3 cm in diameter, the efficiency can be as low as 0.7% increasing to 17.7% for a 15 cm diameter aperture opening . For pencil beam scanning proton therapy, the beam does not interact with considerable amounts of geometry making the efficiency substantially higher. It is also possible to further reduce the simulation time for pencil beam scanning by creating a phase space file and reuse it for different patients and treatment fields that use the same treatment setup. The phase space approach is not practical to use in DS proton therapy as a measure to reduce the simulation time since the nozzle setup is typically completely different for each treatment field. The efficiency in DS proton therapy can, however, be improved by techniques such as geometrical particle splitting and Russian roulette (Eulitz et al 2019a, Ramos-Méndez et al 2013, Ramos Méndez et al 2015. However, phase space simulations do have another purpose in DS proton therapy. In order to reduce the lateral scattering, the air gap (distance between the range compensator and the patient) is kept as small as possible. Depending on the MC code, this may result in an overlap between patient specific components and the patient geometry, potentially leading to problems related to particle tracking in overlapping regions. This scenario occurred for one of our patients. We solved this by cropping/removing irrelevant areas surrounding the patient, thus avoiding overlap. However, when simulating a large number of patients, cropping CT images in the MC code is impractical and time consuming. Therefore, by separating the tracking of particles within the nozzle into a phase space file containing particle type, energy, position, and direction, the dose simulation can be started within the patient geometry, avoiding unambiguity due to overlapping regions (Paganetti et al 2008).
The range of estimated mean LET d values for the two posterior fossa tumor patients was 2.3-4.4 keV µm −1 in the brainstem and 2.4-2.5 keV µm −1 in the target region. Similar ranges have been reported in a study of central nervous system (CNS) injury for pediatric medulloblastoma patients treated with DS proton therapy (Giantsoudi et al 2016). In their study, the authors did not find any statistically significant differences in LET d values between patients with and without CNS injury. In our case, the patient receiving a mean LET d of 4.4 keV µm −1 in the brainstem did also experience symptomatic brainstem injury. This may be a coincidence, but a thorough investigation for similar patients is warranted.

Conclusions
We implemented and commissioned a DS proton therapy nozzle into the FLUKA MC code. Methods for the entire process was described, from solutions on how to implement complex geometries such as rotating range modulator wheels as well as apertures and compensators, to the procedure of calculating variable RBE and LET. Overall, we achieved excellent agreement between the MC code and measurements and also accomplished good agreement between MC and the TPS for two patient cases. Considering that the UFHPTI has one of the most extensive cohorts of patients treated with proton therapy, the nozzle implementation presents a powerful tool that enables correlation studies between variable RBE/LET and long-term follow-up data.

Declarations of interest
None.